HIGH EFFICIENCY GENE TRANSFER AND EXPRESSION IN MAMMALIAN CELLS
BY A MULTIPLE TRANSFECTION PROCEDURE OF MAR SEQUENCES 

Field of the invention 

The present invention relates to purified and isolated DNA sequences having protein
production increasing activity and more specifically to the use of matrix attachment
regions (MARs) for increasing protein production activity in a eukaryotic cell. Also
disclosed is a method for the identification of said active regions, in particular MAR
nucleotide sequences, and the use of these characterized active MAR sequences in a
new multiple transfection method.

Background of the invention 

Nowadays, the model of loop domain organization of eukaryotic chromosomes is well
accepted (Boulikas T, "Nature of DNA sequences at the attachment regions of genes to
the nuclear matrix", J. Cell Biochem., 52:14-22, 1993). According to this model,
chromatin is organized in loops that span 50-100 kb and are attached to the nuclear
matrix, a proteinaceous network made up of RNPs and other nonhistone proteins (Bode J,
Stengert-Iber M, Kay V, Schlake T and Dietz-Pfeilstetter A, Crit. Rev. Euk. Gene Exp.,
5:115-138, 1996).

The DNA regions attached to the nuclear matrix are termed SAR or MAR, for scaffold
(during metaphase) or matrix (interphase) attachment regions respectively
(Hart, C., and Laemmli, U. (1998), "Facilitation of chromatin dynamics by SARs", Curr
Opin Genet Dev 8, 519-525).

As such, these regions may define boundaries of independent chromatin domains,
such that only the encompassed cis-regulatory elements control the expression of the
genes within the domain.

However, their ability to fully shield a chromosomal locus from nearby chromatin
elements, and thus confer position-independent gene expression, has not been seen in
stably transfected cells (Poljak, L., Seum, C., Mattioni, T., and Laemmli, U. (1994)
"SARs stimulate but do not confer position independent gene expression", Nucleic
Acids Res 22, 4386-4394). On the other hand, MAR sequences have been shown to
interact with enhancers to increase local chromatin accessibility (Jenuwein, T.,
Forrester, W., Fernandez-Herrero, L., Laible, G., Dull, M., and Grosschedl, R. (1997)
"Extension of chromatin accessibility by nuclear matrix attachment regions", Nature 385,
269-272). Specifically, MAR elements can enhance expression of heterologous genes
in cell culture lines (Kalos, M., and Fournier, R. (1995) "Position-independent transgene
expression mediated by boundary elements from the apolipoprotein B chromatin
domain", Mol Cell Biol 15, 198-207), transgenic mice (Castilla, J., Pintado, B., Sola, I.,
Sanchez-Morgado, J., and Enjuanes, L. (1998) "Engineering passive immunity in
transgenic mice secreting virus-neutralizing antibodies in milk", Nat Biotechnol 16, 349-
354) and plants (Allen, G., Hall, G. J., Michalowski, S., Newman, W., Spiker, S.,
Weissinger, A., and Thompson, W. (1996), "High-level transgene expression in plant
cells: effects of a strong scaffold attachment region from tobacco", Plant Cell 8, 899-
913). The utility of MAR sequences for developing improved vectors for gene therapy is
also recognized (Agarwal, M., Austin, T., Morel, F., Chen, J., Bohnlein, E., and Plavec, I.
(1998), "Scaffold attachment region-mediated enhancement of retroviral vector
expression in primary T cells", J Virol 72, 3720-3728).

Recently, it has been shown that the chicken lysozyme 5' MAR is able to significantly
enhance reporter expression in pools of stable Chinese Hamster Ovary (CHO) cells
(Zahn-Zabal, M., et al., "Development of stable cell lines for production or regulated
expression using matrix attachment regions", J Biotechnol, 2001, 87(1): p. 29-42). This
property was used to increase the proportion of high-producing clones, thus reducing
the number of clones that need to be screened. These benefits have been observed
both for constructs with MARs flanking the transgene expression cassette and
when constructs are co-transfected with the MAR on a separate plasmid. However,
expression levels upon co-transfection with MARs were not as high as those observed
for a construct in which two MARs delimit the transgene expression unit. Another
limitation of this technique is the quantity of DNA that can be transfected per cell.

Many multiple transfection protocols have been developed in order to achieve a high
transfection efficiency to characterize the function of genes of interest. The protocol
applied by Yamamoto et al., 1999 ("High efficiency gene transfer by multiple transfection
protocol", Histochem. J. 31(4), 241-243) leads to a transfection efficiency of about 80%
after 5 transfection events, whereas the conventional transfection protocol only
achieved a rate of <40%. While this technique may be useful when one wishes to
increase the proportion of expressing cells, it does not lead to cells with a higher
intrinsic productivity. Therefore, it cannot be used to generate high producer
monoclonal cell lines. Hence, the previously described technique has two major
drawbacks:

i) this technique does not generate a homogenous population of transfected
cells, since it cannot favour the integration of further gene copies, nor does it
direct the transgenes to favorable chromosomal loci,
ii) the use of the same selectable marker in multiple transfection events does
not permit the selection of doubly or triply transfected cells.

In patent application WO02/074969, the utility of MARs for the development of stable
eukaryotic cell lines has also been demonstrated. However, this application discloses
neither any conserved homology for MAR DNA elements nor any technique for
predicting the ability of a DNA sequence to be a MAR sequence.

In fact, no clear-cut MAR consensus sequence has been found (Boulikas T, "Nature of
DNA sequences at the attachment regions of genes to the nuclear matrix", J. Cell
Biochem., 52:14-22, 1993) but, evolutionarily, the structure of these sequences seems to
be functionally conserved in eukaryotic genomes, since animal MARs can bind to plant
nuclear scaffolds and vice versa (Mielke C, Kohwi Y, Kohwi-Shigematsu T and Bode J,
"Hierarchical binding of DNA fragments derived from scaffold-attached regions:
correlation of properties in vitro and function in vivo", Biochemistry, 29:7475-7485,
1990).

The identification of MARs by biochemical studies is a long and unpredictable process;
various results can be obtained depending on the assay (Razin SV, "Functional
architecture of chromosomal DNA domains", Crit Rev Eukaryot Gene Expr., 6:247-269,
1996). Considering the huge number of expected MARs in a eukaryotic genome and
the amount of sequence data issued from genome projects, a tool able to filter potential
MARs in order to perform targeted experiments would be very useful.

Currently, two different predictive tools for MARs are available via the Internet.
The first one, MAR-Finder (http://futuresoft.org/MarFinder; Singh GB, Kramer JA and
Krawetz SA, "Mathematical model to predict regions of chromatin attachment to the
nuclear matrix", Nucleic Acids Research, 25:1419-1425, 1997) is based on a set of
patterns identified within several MARs and a statistical analysis of the co-occurrence of
these patterns. MAR-Finder predictions are dependent on the sequence context,
meaning that predicted MARs depend on the context of the submitted sequence. The
other predictive software, SMARTest (http://www.genomatix.de; Frisch M, Frech K,
Klingenhoff A, Cartharius K, Liebich I and Werner T, "In silico prediction of
scaffold/matrix attachment regions in large genomic sequences", Genome Research,
12:349-354, 2001), uses weight matrices derived from experimentally identified MARs.
SMARTest is said to be suitable for large-scale analyses. However, aside from its
relatively poor specificity, the number of hypothetical MARs rapidly becomes huge in
large-scale analyses and, since there is no way to increase its specificity to restrain
the number of hypothetical MARs, SMARTest becomes almost useless for planning
wet-lab experiments efficiently.

Some other software tools, not available via the Internet, also exist; they are likewise
based on the frequency of MAR motifs (MRS criterion; Van Drunen CM et al., "A bipartite
sequence element associated with matrix/scaffold attachment regions", Nucleic Acids
Res, 27:2924-2930, 1999), (ChrClass; Glazko GV et al., "Comparative study and
prediction of DNA fragments associated with various elements of the nuclear matrix",
Biochim. Biophys. Acta, 1517:351-356, 2001) or on the identification of sites of
stress-induced DNA duplex destabilization (SIDD; Benham C et al., "Stress-induced
duplex DNA destabilization in scaffold/matrix attachment regions", J. Mol. Biol.,
274:181-196, 1997).
However, their suitability to analyze complete genome sequences remains unknown,
and whether these tools may allow the identification of protein production-increasing
sequences has not been reported.

All the above predictive methods have drawbacks that prevent large-scale
analyses of genomes to reliably identify novel and potent MARs.

SUMMARY OF THE INVENTION 

Therefore, the object of the present invention is to provide an improved method for the
identification of DNA sequences having protein production increasing activity, in
particular MAR nucleotide sequences, and the use of these characterized active MAR
sequences in a new multiple transfection method to increase the production of
recombinant proteins in eukaryotic cells.

Brief description of the Figures

Fig. 1 shows the distribution plots of MAR and non-MAR sequences. Histograms are
density plots (relative frequency divided by the bin width) relative to the score of the
observed parameter. The density histogram for human MARs in the SMARt DB
database is shown in black, while the density histogram for human chromosome 22
is shown in grey.



Fig. 2 shows scatterplots of the four different criteria used by SMAR Scan and the
AT-content with human MARs from SMARt DB.

Fig. 3 shows the distribution plots of MAR sequences by organism. MAR sequences
from SMARt DB of other organisms were retrieved and analyzed. The MAR sequence
density distributions for mouse, chicken, Sorghum bicolor and human
are plotted jointly.

Fig. 4 shows SMAR Scan predictions on human chromosome 22 and on shuffled
chromosome 22. The left plot corresponds to the number of hits obtained by SMAR Scan
when analyzing crumbled, scrambled and native chromosome 22. The right plot represents
the number of S/MARs predicted by SMAR Scan in crumbled, scrambled and native
chromosome 22.

Fig. 5 shows the dissection of the ability of the chicken lysozyme gene 5'-MAR to
stimulate transgene expression in CHO-DG44 cells. Fragments B, K and F show the
highest ability to stimulate transgene expression. The indicated relative strength of the
elements was based on the number of high-expressor cells.

Fig. 6 shows the effect of serial deletions of the 5'-end (upper part) and the 3'-end
(lower part) of the 5'-MAR on the loss of ability to stimulate transgene expression. The
transition from increased to decreased activity coincides with the B-, K- and F-fragments.

Fig. 7 shows that portions of the F fragment significantly stimulate transgene
expression. The F fragment regions indicated by the light grey arrow were multimerized,
inserted in pGEGFP Control and transfected in CHO cells. The element that displays
the highest activity is located in the central part of the element and corresponds to
fragment FIII (black bar labelled minimal MAR). In addition, an enhancer activity is
located in the 3'-flanking part of the FIII fragment (dark grey bar labelled MAR
enhancer).

Fig. 8 shows a map of locations for various DNA sequence motifs within the cLysMAR.
Fig. 8(B) represents a map of locations for various DNA sequence motifs within the
cLysMAR. Vertical lines represent the position of the computer-predicted sites or
sequence motifs along the 3034 base pairs of the cLysMAR and its active regions, as
presented in Fig. 5. The putative transcription factor sites (MEF2 05, Oct-1, USF-02,
GATA, NFAT) for activators and (CDP, SATB1, CTCF, ARBP/MeCP2) for repressors of
transcription were identified using MatInspector (Genomatix), and CpG islands were
identified with CPGPLOT. Motifs previously associated with MAR elements are labelled
in black and include CpG dinucleotides and CpG islands, unwinding motifs (AATATATT
and AATATT), poly As and Ts, poly Gs and Cs, Drosophila topoisomerase II binding
sites (GTNWAYATTNATTNATNNR) which had identity to the 6 bp core, and high
mobility group I (HMG-I/Y) protein binding sites. Other labelled structural motifs
include nucleosome-binding and nucleosome-disfavouring sites and a motif thought to
relieve the superhelical strain of DNA. Fig. 8(A) represents the comparison of the
ability of portions of the cLysMAR to activate transcription with MAR prediction score
profiles obtained with MAR-Finder. The top diagram shows the MAR fragment activity as
in Fig. 5, while the middle and bottom curves show the MAR-Finder-predicted potential
for MAR activity and for bent DNA structures respectively.

Fig. 9 shows the correlation of DNA physico-chemical properties with MAR activity.
Fig. 9(A) represents the DNA melting temperature, double helix bending, major groove
depth and minor groove width profiles of the 5'-MAR, which were determined using the
algorithms of Levitsky et al. The most active B, K and F fragments depicted at the top
are as shown in Figure 1. Fig. 9(B) represents the enlargement of the data
presented in panel A to display the F fragment map aligned with the tracings
corresponding to the melting temperature (top curve) and DNA bending (bottom curve).
The positions of the most active FIB fragment and of protein binding sites for specific
transcription factors are as indicated.

Fig. 10 shows the distribution of putative transcription factor binding sites within the
5'-MAR. Large arrows indicate the position of the CUE elements as identified with
SMAR Scan.

Fig. 11 shows the scheme of assembly of various portions of the MAR. The indicated
portions of the cLysMAR were amplified by PCR, introducing BglII-BamHI linker
elements at each extremity, and assembled to generate the depicted composite
elements. For instance, the top construct consists of the assembly of all CUE and
flanking sequences at their original location, except that BglII-BamHI linker sequences
separate each element.

Fig. 12 represents the plasmid maps. 

Fig. 13 shows the effect of re-transfecting primary transfectants on GFP expression. 

Cells (CHO-DG44) were co-transfected with pSV40EGFP (left tube) or pMAR-
SV40EGFP (central tube) and pSVneo as resistance plasmid. Cells transfected with
pMAR-SV40EGFP were re-transfected 24 hours later with the same plasmid and a
different selection plasmid, pSVpuro (right tube). After two weeks of selection, the
phenotype of the stably transfected cell population was analysed by FACS.

Fig. 14 shows the effect of multiple loads of MAR-containing plasmid. The pMAR-
SV40EGFP/pMAR-SV40EGFP secondary transfectants were used in a third cycle of
transfection at the end of the selection process. The tertiary transfection was
accomplished with pMAR or pMAR-SV40EGFP to give tertiary transfectants. After 24
hours, cells were transfected again with either plasmid, resulting in the quaternary
transfectants (see Table 4).

Detailed description of the Invention 

The present invention relates to a purified and isolated DNA sequence having protein
production increasing activity, characterized in that said DNA sequence comprises at
least one bent DNA element and at least one binding site for a DNA binding protein.

Certain sequences of DNA are known to form a relatively "static curve", where the DNA
follows a particular 3-dimensional path. Thus, instead of just being in the normal B-DNA
conformation ("straight"), the piece of DNA can form a flat, planar curve also
defined as bent DNA (Marini et al., 1982, "Bent helical structure in kinetoplast DNA",
Proc. Natl. Acad. Sci. USA, 79: 7664-7664).

According to the present invention, the bent DNA element is usually a MAR nucleotide
sequence selected from the group comprising the sequences SEQ ID Nos 1 to 23 or a
cLysMAR element or a fragment thereof. Preferably, the bent DNA element is a MAR
nucleotide sequence selected from the group comprising the sequences SEQ ID Nos 1
to 23, more preferably the sequences SEQ ID Nos 21 to 23.

Also encompassed by the present invention are complementary sequences of the
above-mentioned sequences SEQ ID Nos 1 to 23 and of the cLysMAR element or
fragment, which can be produced by using PCR.

An element is a conserved nucleotide sequence that bears common functional
properties (i.e. binding sites for transcription factors) or structural features
(i.e. bent DNA sequence).

A part of sequences SEQ ID Nos 1 to 23 and the cLysMAR element or fragment refers
to sequences sharing at least 70% of nucleotides in length with the respective sequence
of the SEQ ID Nos 1 to 23. These sequences can be used as long as they exhibit the
same properties as the native sequence from which they derive. Preferably, these
sequences share more than 80%, in particular more than 90%, of nucleotides in length
with the respective sequence of the SEQ ID Nos 1 to 23.

The present invention also includes variants of the aforementioned sequences SEQ ID
Nos 1 to 23 and the cLysMAR element or fragment, that is, nucleotide sequences that
vary from the reference sequence by conservative nucleotide substitutions, whereby
one or more nucleotides are substituted by another with the same characteristics.

The sequences SEQ ID Nos 1 to 23 have been identified by scanning human
chromosomes 1 and 2 using SMAR Scan, showing that the identification of novel MAR
sequences is feasible using the tools reported hereafter.

In a first step, the complete chromosomes 1 and 2 were screened to identify bent DNA
elements as regions corresponding to the highest bend, major groove depth and minor
groove width and the lowest melting temperature, as in Figure 3. In a second step, this
collection of sequences was scanned for binding sites of regulatory proteins such as
SATB1, GATA, etc. (as shown in Figure 8B), yielding sequences SEQ ID Nos 1 to 23.
Furthermore, sequences 21 to 23 were shown to be located next to known genes from
the Human Genome Data Base.

Molecular chimeras of MAR sequences are also considered in the present invention. By
molecular chimera is intended a nucleotide sequence that may include a functional
portion of a MAR element and that will be obtained by molecular biology methods
known by those skilled in the art.

Particular combinations of MAR elements or fragments or sub-portions thereof are also
considered in the present invention. These fragments can be prepared by a variety of
methods known in the art. These methods include, but are not limited to, digestion with
restriction enzymes and recovery of the fragments, chemical synthesis or polymerase
chain reactions (PCR).

Therefore, particular combinations of elements or fragments of the sequences SEQ ID
Nos 1 to 23 and cLysMAR elements or fragments are also envisioned in the present
invention, depending on the functional results to be obtained. Elements of the cLysMAR
are e.g. the B, K and F regions as described in WO 02/074969, the disclosure of which
is hereby incorporated herein by reference, in its entirety. The preferred elements of the
cLysMAR used in the present invention are the B, K and F regions. Only one element
might be used, or multiple copies of the same or distinct elements (multimerized
elements) might be used (see Fig. 8(A)).

By fragment is intended a portion of the respective nucleotide sequence. Fragments of
a MAR nucleotide sequence may retain biological activity and hence bind to purified
nuclear matrices and/or alter the expression patterns of coding sequences operably
linked to a promoter. Fragments of a MAR nucleotide sequence may range from at
least about 100 to 1000 bp, preferably from about 200 to 700 bp, more preferably from
about 300 to 500 bp. Also envisioned are any combinations of fragments
which have the same number of nucleotides present in a synthetic MAR sequence
consisting of natural MAR elements and/or fragments. The fragments are preferably
assembled by linker sequences. Preferred linkers are BglII-BamHI linkers.

"Protein production increasing activity" refers to an activity of the purified and isolated
DNA sequence defined as follows: after having been introduced under suitable
conditions into a eukaryotic host cell, the sequence is capable of increasing protein
production levels in cell culture as compared to a culture of cells transfected without said
DNA sequence. Usually the increase is 1.5 to 6 fold, preferably 4 to 6 fold. This
corresponds to a production rate or a specific cellular productivity of at least 10 pg per
cell per day (see Example 11 and Fig. 13).

As used herein, the following definitions are supplied in order to facilitate the 
understanding of this case. To the extent that the definitions vary from meanings 
circulating within the art, the definitions below are to control. 

"Chromatin" is the nucleic acid material constituting the chromosomes of a eukaryotic
cell, and refers to DNA, RNA and associated proteins.

A "chromatin element" means a nucleic acid sequence on a chromosome. 

"Cis" refers to the placement of two or more elements (such as chromatin elements) on
the same nucleic acid molecule (such as the same vector or chromosome).

"Trans" refers to the placement of two or more elements (such as chromatin elements)
on two or more different nucleic acid molecules (such as on two vectors or two
chromosomes).

Chromatin modifying elements that are potentially capable of overcoming position
effects, and hence are of interest for the development of stable cell lines, include
boundary elements (BEs), matrix attachment regions (MARs), locus control regions
(LCRs), and universal chromatin opening elements (UCOEs).

Boundary elements ("BEs"), or insulator elements, define boundaries in chromatin in
many cases (Bell, A., and Felsenfeld, G. 1999; "Stopped at the border: boundaries and
insulators", Curr Opin Genet Dev 9, 191-198) and may play a role in defining a
transcriptional domain in vivo. BEs lack intrinsic promoter/enhancer activity, but rather
are thought to protect genes from the transcriptional influence of regulatory elements in
the surrounding chromatin. The enhancer-block assay is commonly used to identify
insulator elements. In this assay, the chromatin element is placed between an enhancer
and a promoter, and enhancer-activated transcription is measured. Boundary elements
have been shown to be able to protect stably transfected reporter genes against
position effects in Drosophila, yeast and mammalian cells. They have also been
shown to increase the proportion of transgenic mice with inducible transgene
expression.



Locus control regions ("LCRs") are cis-regulatory elements required for the initial
chromatin activation of a locus and subsequent gene transcription in their native
locations (Grosveld, F. 1999, "Activation by locus control regions?" Curr Opin Genet
Dev 9, 152-157). The activating function of LCRs also allows the expression of a
coupled transgene in the appropriate tissue in transgenic mice, irrespective of the site
of integration in the host genome. While LCRs generally confer tissue-specific levels of
expression on linked genes, efficient expression in nearly all tissues in transgenic mice
has been reported for a truncated human T-cell receptor LCR and a rat LAP LCR. The
most extensively characterized LCR is that of the globin locus. Its use in vectors for the
gene therapy of sickle cell disease and β-thalassemias is currently being evaluated.

Ubiquitous chromatin opening elements ("UCOEs", also known as "ubiquitously-acting
chromatin opening elements") have been reported in WO 00/05393.

An "enhancer" is a nucleotide sequence that acts to potentiate the transcription of
genes independent of the identity of the gene, the position of the sequence in relation
to the gene, or the orientation of the sequence. The vectors of the present invention
optionally include enhancers.

A "gene" is a deoxyribonucleotide (DNA) sequence coding for a given mature protein.
As used herein, the term "gene" shall not include untranslated flanking regions such as
transcription initiation signals, polyadenylation addition sites, promoters or
enhancers.

A "product gene" is a gene that encodes a protein product having desirable
characteristics such as diagnostic or therapeutic utility. A product gene includes, e.g.,
structural genes and regulatory genes.

A "structural gene" refers to a gene that encodes a structural protein. Examples of
structural proteins include, but are not limited to, cytoskeletal proteins, extracellular
matrix proteins, enzymes, nuclear pore proteins and nuclear scaffold proteins, ion
channels and transporters, contractile proteins, and chaperones. Preferred structural
genes encode antibodies or antibody fragments.

A "regulatory gene" refers to a gene that encodes a regulatory protein. Examples of
regulatory proteins include, but are not limited to, transcription factors, hormones,
growth factors, cytokines, signal transduction molecules, oncogenes, proto-oncogenes,
transmembrane receptors, and protein kinases.

"Orientation" refers to the order of nucleotides in a given DNA sequence. For example,
an inverted orientation of a DNA sequence is one in which the 5' to 3' order of the
sequence in relation to another sequence is reversed when compared to a point of
reference in the DNA from which the sequence was obtained. Such reference points
can include the direction of transcription of other specified DNA sequences in the
source DNA and/or the origin of replication of replicable vectors containing the
sequence.

"Eukaryotic cell" refers to any mammalian or non-mammalian cell from a eukaryotic
organism. By way of non-limiting example, any eukaryotic cell that is capable of being
maintained under cell culture conditions and subsequently transfected would be
included in this invention. Especially preferable cell types include, e.g., stem cells,
embryonic stem cells, Chinese hamster ovary cells (CHO), COS, BHK21, NIH3T3,
HeLa, C2C12, cancer cells, and primary differentiated or undifferentiated cells. Other
suitable host cells are known to those skilled in the art.

The terms "host cell" and "recombinant host cell" are used interchangeably herein to
indicate a eukaryotic cell into which one or more vectors of the invention have been
introduced. It is understood that such terms refer not only to the particular subject cell
but also to the progeny or potential progeny of such a cell. Because certain
modifications may occur in succeeding generations due to either mutation or
environmental influences, such progeny may not, in fact, be identical to the parent cell,
but are still included within the scope of the term as used herein.

The terms "introducing a purified DNA into a eukaryotic host cell" or "transfection"
denote any process wherein an extracellular DNA, with or without accompanying
material, enters a host cell. The term "cell transfected" or "transfected cell" means the
cell into which the extracellular DNA has been introduced and which thus harbours the
extracellular DNA. The DNA might be introduced into the cell so that the nucleic acid is
replicable either as a chromosomal integrant or as an extrachromosomal element.

"Promoter" as used herein refers to a nucleic acid sequence that regulates expression
of a gene.

"Co-transfection" means the process of transfecting a eukaryotic cell with more than
one exogenous gene foreign to the cell, one of which may confer a selectable
phenotype on the cell.



The purified and isolated DNA sequence having protein production increasing activity
also comprises, besides one or more bent DNA elements, at least one binding site for a
DNA binding protein.

Usually the DNA binding protein is a transcription factor. Examples of transcription
factors are the group comprising the polyQpolyP domain proteins.
Another example of a transcription factor is a transcription factor selected from the
group comprising SATB1, NMP4, MEF2, S8, DLX1, FREAC7, BRN2, GATA 1/3, TATA,
Bright, MSX or a combination of two or more thereof. PolyQpolyP domain proteins and
SATB1, NMP4, MEF2, S8, DLX1, FREAC7, BRN2, GATA 1/3, TATA, Bright, MSX or a
combination of two or more of these transcription factors are preferred. Most preferred
are SATB1, NMP4, MEF2 and polyQpolyP domain proteins.

SATB1, NMP4 and MEF2, for example, are known to regulate the development and/or
tissue-specific gene expression in mammals. These transcription factors have the
capacity to alter DNA geometry, and reciprocally, binding to DNA as an allosteric ligand
modifies their structure. Recently, SATB1 was found to form a cage-like structure
circumscribing heterochromatin (Cai, S., H.J. Han, and T. Kohwi-Shigematsu, "Tissue-
specific nuclear architecture and gene expression regulated by SATB1", Nat Genet,
2003. 34(1): p. 42-51).

A further aspect of the invention is to provide a method for identifying a MAR sequence
using a bioinformatic tool, comprising the computing of values of one or more DNA
sequence features corresponding to DNA bending, major groove depth and minor
groove width potentials, and melting temperature. Preferably, the identification of one or
more DNA sequence features further comprises a further DNA sequence feature
corresponding to binding sites for DNA binding proteins, which is also computed with
this method.

The bioinformatic tool used for the present method is preferably SMAR Scan, which
contains algorithms developed by Gene Express (http://srs6.bionet.nsc.ru/srs6bin/cgi-
bin/wgetz?-e+[FEATURES-SiteID:'nR']) and based on Levitsky et al., 1999. These
algorithms recognise profiles, based on dinucleotide weight matrices, to compute the
theoretical values for conformational and physicochemical properties of DNA.
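
As a rough illustration of how a dinucleotide weight matrix can be turned into a
property profile along a sequence, the following minimal Python sketch averages
per-dinucleotide weights over a sliding window. The function name, the window
handling and the weight values are assumptions made for illustration only; they are
not the published Levitsky et al. parameters and not the SMAR Scan code.

```python
from typing import Dict, List

# Hypothetical dinucleotide weights for a single property. The values are
# illustrative placeholders, NOT the published Levitsky et al. parameters.
EXAMPLE_WEIGHTS: Dict[str, float] = {a + b: 0.0 for a in "ACGT" for b in "ACGT"}
EXAMPLE_WEIGHTS.update({"AA": 1.0, "AT": 2.0, "TA": 3.0, "TT": 1.0})


def dinucleotide_profile(seq: str, weights: Dict[str, float], window: int) -> List[float]:
    """Mean dinucleotide weight for each window of `window` bases, stepping by one base."""
    seq = seq.upper()
    if window < 2 or len(seq) < window:
        return []
    # One weight per overlapping dinucleotide step; unknown steps (e.g. with N) score 0.
    steps = [weights.get(seq[i:i + 2], 0.0) for i in range(len(seq) - 1)]
    per_window = window - 1  # number of dinucleotide steps inside one window
    total = sum(steps[:per_window])
    profile = [total / per_window]
    # Slide the window: add the step entering on the right, drop the one leaving on the left.
    for i in range(per_window, len(steps)):
        total += steps[i] - steps[i - per_window]
        profile.append(total / per_window)
    return profile
```

One such profile would be computed per property (bending, major groove depth, minor
groove width, melting temperature), each with its own weight table.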

Preferably, SMAR Scan uses the four theoretical criteria, also designated as DNA
sequence features, corresponding to DNA bending, major groove depth and minor
groove width potentials, and melting temperature, in all possible combinations, using
scanning windows of variable size (see Fig. 3). For each function used, a cut-off value
has to be set. The program returns a hit every time the computed score of a given region
is above the set cut-off value for all of the chosen criteria. Two data output modes are
available to handle the hits. The first (called "profile-like") simply returns all hit
positions on the query sequence and their corresponding values for the different criteria
chosen. The second mode (called "contiguous hits") returns only the positions of several
contiguous hits and their corresponding sequence. For this mode, the minimum number
of contiguous hits is another cut-off value that can be set, again with a tunable window
size. This second mode is the default mode of SMAR Scan. Indeed, from a semantic
point of view, a hit is considered as a core-unwinding element (CUE), and a cluster of
CUEs accompanied by clusters of binding sites for relevant proteins is considered as a
MAR. Thus, SMAR Scan considers only several contiguous hits as a potential MAR.
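
The hit and "contiguous hits" logic described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the SMAR Scan implementation: the function names `window_hits` and `contiguous_runs`, the one-nucleotide window step, and the toy `at_fraction` scorer (a stand-in for the dinucleotide weight-matrix profiles) are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple


def window_hits(seq: str,
                scorers: Dict[str, Callable[[str], float]],
                cutoffs: Dict[str, float],
                window: int) -> List[int]:
    """Return window start positions scoring above the cut-off for ALL criteria."""
    hits = []
    for start in range(len(seq) - window + 1):
        sub = seq[start:start + window]
        if all(scorers[name](sub) > cutoffs[name] for name in cutoffs):
            hits.append(start)
    return hits


def contiguous_runs(hits: List[int], min_hits: int) -> List[Tuple[int, int]]:
    """Group consecutive hit positions and keep runs of at least min_hits hits."""
    runs = []
    run_start = prev = None
    for pos in hits:
        if prev is not None and pos == prev + 1:
            prev = pos          # the current run continues
            continue
        if run_start is not None and prev - run_start + 1 >= min_hits:
            runs.append((run_start, prev))
        run_start = prev = pos  # a new run starts here
    if run_start is not None and prev - run_start + 1 >= min_hits:
        runs.append((run_start, prev))
    return runs


def at_fraction(sub: str) -> float:
    """Toy criterion (assumption): fraction of A/T bases in the window."""
    return sum(base in "AT" for base in sub) / len(sub)


# Toy query: an AT-only stretch flanked by GC-only stretches.
seq = "GCGC" + "AT" * 10 + "GCGC"
hits = window_hits(seq, {"toy": at_fraction}, {"toy": 0.9}, window=4)
candidates = contiguous_runs(hits, min_hits=5)
```

Here `hits` plays the role of the "profile-like" output and `candidates` the role of the "contiguous hits" output; with real criteria, one scorer per structural property would be supplied in `scorers`.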

To tune the default cut-off values for the four theoretical structural criteria,
experimentally validated MARs from SMARt DB (http://transfac.gbf.de/SMARtDB)
were used. All the human MAR sequences from the database were retrieved and
analyzed with SMAR Scan using the "profile-like" mode with the four criteria and with
no set cut-off value. This yielded the value of each function for every position of the
sequences. The distribution for each criterion was then computed according to these
data (see Fig. 1 and 3).

The default cut-off values of SMAR Scan for the bend, the major groove depth and the
minor groove width were set at the average of the 75th quantile and the median. For
the melting temperature, the default cut-off value should be set at the 75th quantile.
The minimum length for the "contiguous hits" mode should be set to 300 because it is
assumed to be the minimum length of a MAR (see Fig. 8 and 9). However, one skilled
in the art would be able to determine the cut-off values for the above-mentioned criteria
for a given organism with minimal experimentation.
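
As an illustration of the rule stated above, the default cut-offs could be derived from per-position profile values as in the following sketch. The criterion names, the `default_cutoffs` helper and the choice of the "inclusive" quantile method are assumptions made for the example; the text does not specify how the quantiles were estimated.

```python
import statistics
from typing import Dict, List


def default_cutoffs(values_by_criterion: Dict[str, List[float]]) -> Dict[str, float]:
    """Apply the stated rule: mean of median and 75th quantile for the
    structural criteria, plain 75th quantile for the melting temperature."""
    cutoffs = {}
    for name, values in values_by_criterion.items():
        # quantiles(n=4) returns the three quartile cut points.
        _q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
        if name == "melting_temperature":  # hypothetical criterion name
            cutoffs[name] = q3
        else:
            cutoffs[name] = (median + q3) / 2
    return cutoffs


# Toy profile values (not real data), one list per criterion.
example = default_cutoffs({
    "bend": [1, 2, 3, 4, 5],
    "melting_temperature": [1, 2, 3, 4, 5],
})
```

For a different organism, the same function would simply be fed the profile values obtained from that organism's validated MARs.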






Preferably, DNA bending values are comprised between 3 and 5° (radial degrees). Most
preferably, they are situated between 3.8 and 4.4°, corresponding to the smallest peak
of Fig. 1.






Preferably the major groove depth values are comprised between 8.9 and 9.3 Å
(Angstrom) and minor groove width values between 5.2 and 5.8 Å. Most preferably the
major groove depth values are comprised between 9.0 and 9.2 Å and minor groove width
values between 5.4 and 5.7 Å.

Preferably the melting temperature is situated between 55 and 75 °C (degrees Celsius).
Most preferably the melting temperature is comprised between 55 and 82 °C.

The DNA binding protein for which values can be computed by the method is usually a
transcription factor, preferably a polyQpolyP domain protein or a transcription factor
selected from the group comprising SATB1, NMP4, MEF2, S8, DLX1, FREAC7, BRN2, GATA
1/3, TATA, Bright, MSX or a combination of two or more of these transcription factors.
However, one skilled in the art would be able to determine other kinds of transcription
factors in order to carry out the method according to the present invention.

The present invention also encompasses the use of a purified and isolated DNA
sequence comprising a first isolated matrix attachment region (MAR) nucleotide
sequence which is a MAR nucleotide sequence selected from the group comprising the
sequences SEQ ID Nos 1 to 23, a sequence complementary thereof, a part thereof
sharing at least 70% nucleotides in length, a molecular chimera thereof, a combination
thereof and variants, or a MAR nucleotide sequence of a cLysMAR element and/or
fragment, a sequence complementary thereof, a part thereof sharing at least 70%
nucleotides in length, a molecular chimera thereof, a combination thereof and variants,
for increasing protein production activity in a eukaryotic host cell.

Said purified and isolated DNA sequence usually further comprises one or more
regulatory sequences, as known in the art, e.g. a promoter and/or an enhancer,
polyadenylation sites and splice junctions usually employed for the expression of the
protein, or may optionally encode a selectable marker. Preferably said purified and
isolated DNA sequence comprises a promoter which is operably linked to a gene of
interest.

The DNA sequences of this invention can be isolated according to standard PCR
protocols and methods well known in the art.

Promoters which can be used, provided that such promoters are compatible with the
host cell, are, for example, promoters obtained from the genomes of viruses such as
polyoma virus, adenovirus (such as Adenovirus 2), papilloma virus (such as bovine
papilloma virus), avian sarcoma virus, cytomegalovirus (such as murine or human
cytomegalovirus immediate early promoter), a retrovirus, hepatitis-B virus, and Simian
Virus 40 (such as SV 40 early and late promoters), or heterologous mammalian
promoters, such as the actin promoter, an immunoglobulin promoter or heat shock
promoters. Such regulatory sequences direct constitutive expression.

Furthermore, the purified and isolated DNA sequence might further comprise regulatory
sequences which are capable of directing expression of the nucleic acid preferentially in
a particular cell type (e.g., tissue-specific regulatory elements are used to express the
nucleic acid). Tissue-specific regulatory elements are known in the art. Non-limiting
examples of suitable tissue-specific promoters include the albumin promoter
(liver-specific; Pinkert, et al., 1987. Genes Dev. 1: 268-277), lymphoid-specific promoters
(Calame and Eaton, 1988. Adv. Immunol. 43: 235-275), in particular promoters of T cell
receptors (Winoto and Baltimore, 1989. EMBO J. 8: 729-733) and immunoglobulins
(Banerji, et al., 1983. Cell 33: 729-740; Queen and Baltimore, 1983. Cell 33: 741-748),
neuron-specific promoters (e.g., the neurofilament promoter; Byrne and Ruddle, 1989.
Proc. Natl. Acad. Sci. USA 86: 5473-5477), pancreas-specific promoters (Edlund, et al.,
1985. Science 230: 912-916), and mammary gland-specific promoters (e.g., milk whey
promoter; U.S. Pat. No. 4,873,316 and European Application No. 264,166).

Developmentally-regulated promoters are also encompassed. Examples of such
promoters include, e.g., the murine hox promoters (Kessel and Gruss, 1990. Science
249: 374-379) and the α-fetoprotein promoter (Camper and Tilghman, 1989. Genes
Dev. 3: 537-546).

Regulatable gene expression promoters are well known in the art, and include, by way
of non-limiting example, any promoter that modulates expression of a gene encoding a
desired protein by binding an exogenous molecule, such as the CRE/LOX system, the
TET system, the NFkappaB/UV light system, the Leu3p/isopropylmalate system, and
the GLVPc/GAL4 system (see e.g., Sauer, 1998, Methods 14 (4): 381-92; Lewandoski,
2001, Nat. Rev. Genet. 2 (10): 743-55; Legrand-Poels et al., 1998, J. Photochem.
Photobiol. B. 45: 1-8; Guo et al., 1996, FEBS Lett. 390 (2): 191-5; Wang et al., PNAS
USA, 1999, 96 (15): 8483-8).

However, one skilled in the art would be able to determine other kinds of promoters that
are suitable in carrying out the present invention.

Enhancers can optionally be included in the purified DNA sequence of the invention,
then belonging to the regulatory sequence, e.g. the promoter.

The gene of interest preferably encodes a protein (structural or regulatory protein). As
used herein, "protein" refers generally to peptides and polypeptides having more than
about ten amino acids. The proteins may be "homologous" to the host (i.e.,
endogenous to the host cell being utilized), or "heterologous" (i.e., foreign to the host
cell being utilized), such as a human protein produced by yeast. The protein may be
produced as an insoluble aggregate or as a soluble protein in the periplasmic space or
cytoplasm of the cell, or in the extracellular medium. Examples of proteins include
hormones such as growth hormone, growth factors such as epidermal growth factor,
analgesic substances like enkephalin, enzymes like chymotrypsin, and receptors to
hormones or growth factors, as well as proteins usually used as a visualizing
marker, e.g. green fluorescent protein.

Preferably the purified DNA sequence further comprises at least a second isolated
matrix attachment region (MAR) nucleotide sequence selected from the group
comprising the sequences SEQ ID Nos 1 to 23 or cLysMAR, a sequence
complementary thereof, a part thereof sharing at least 70% nucleotides in length, a
molecular chimera thereof, a combination thereof and variants. The isolated matrix
attachment region (MAR) nucleotide sequences might be identical or different.
Alternatively, a first and a second identical MAR nucleotide sequence are used.

Preferably, the MAR nucleotide sequences are located at both the 5' and the 3' ends of
the sequence containing the promoter and the gene of interest. But the invention also
envisions that said first and/or at least second MAR nucleotide sequences are
located on a sequence distinct from the one containing the promoter and the gene of
interest.

Also embraced by the scope of the present invention is the purified and isolated DNA
sequence comprising a first isolated matrix attachment region (MAR) nucleotide
sequence which is a MAR nucleotide sequence selected from the group comprising the
sequences SEQ ID Nos 1 to 23, a sequence complementary thereof, a part thereof
sharing at least 70% nucleotides in length, a molecular chimera thereof, a combination
thereof and variants, or a MAR nucleotide sequence of a cLysMAR element and/or
fragment, a sequence complementary thereof, a part thereof sharing at least 70%
nucleotides in length, a molecular chimera thereof, a combination thereof and variants,
which can be used for increasing protein production activity in a eukaryotic host cell by
introducing the purified DNA sequence into a eukaryotic host cell according to well
known protocols. Usually applied methods for introducing DNA into eukaryotic host cells
are, e.g., direct introduction of cloned DNA by microinjection or
microparticle bombardment; use of viral vectors; encapsulation within a carrier system;
and use of transfecting reagents such as calcium phosphate, diethylaminoethyl
(DEAE)-dextran or commercial transfection systems like LipofectAMINE 2000 (Invitrogen).
Preferably, the transfection method used to introduce the purified DNA sequence into a
eukaryotic host cell is the method for transfecting a eukaryotic cell as described below.

The purified and isolated DNA sequence is used in the form of a circular vector.
Preferably, the purified and isolated DNA sequence is in the form of a linear DNA
sequence as vector.

As used herein, "plasmid" and "vector" are used interchangeably, as the plasmid is the 
most commonly used vector form. However, the invention is intended to include such 
other forms of expression vectors, including, but not limited to, viral vectors (e. g., 
replication defective retroviruses, adenoviruses and adeno-associated viruses), which 
serve equivalent functions. 



The present invention further encompasses a method for transfecting a eukaryotic host
cell, said method comprising

a) introducing into said eukaryotic host cell at least one purified DNA sequence
comprising at least one DNA sequence of interest and/or at least one purified
and isolated DNA sequence comprising a MAR nucleotide sequence or other
chromatin modifying elements,

b) subjecting within a defined time said transfected eukaryotic host cell to at least
one additional transfection step with at least one purified DNA sequence
comprising at least one DNA sequence of interest and/or with at least one
purified and isolated DNA sequence comprising a MAR nucleotide sequence or
other chromatin modifying elements,

c) selecting said transfected eukaryotic host cell.

Preferably at least two and up to four transfecting steps are applied in step b).



In order to select the successfully transfected cells, a gene that encodes a selectable
marker (e.g., resistance to antibiotics) is generally introduced into the host cells along
with the gene of interest. The gene that encodes a selectable marker might be located
on the purified DNA sequence comprising at least one DNA sequence of interest and/or
at least one purified and isolated DNA sequence consisting of a MAR nucleotide
sequence or other chromatin modifying elements, or might optionally be co-introduced in
separate form, e.g. on a plasmid. Various selectable markers include those that confer
resistance to drugs, such as G418, hygromycin and methotrexate. The amount of the
drug can be adapted as desired in order to increase productivity.
Usually, one or more selectable markers are used. Preferably, the selectable markers
used in each distinct transfection step are different. This allows selecting the
transformed cells that are "multi-transformed" by using, for example, two different
antibiotic selections.

Any eukaryotic host cell capable of protein production and lacking a cell wall can be
used in the methods of the invention. Examples of useful mammalian host cell lines
include human cells such as human embryonic kidney line (293 or 293 cells subcloned
for growth in suspension culture, Graham et al., J. Gen Virol 36, 59 (1977)), human
cervical carcinoma cells (HELA, ATCC CCL 2), human lung cells (W138, ATCC CCL
75), human liver cells (Hep G2, HB 8065); rodent cells such as baby hamster kidney
cells (BHK, ATCC CCL 10), Chinese hamster ovary cells/-DHFR (CHO, Urlaub and
Chasin, Proc. Natl. Acad. Sci. USA, 77, 4216 (1980)), mouse Sertoli cells (TM4, Mather,
Biol. Reprod 23, 243-251 (1980)), mouse mammary tumor (MMT 060562, ATCC
CCL51); and cells from other mammals such as monkey kidney CV1 line transformed
by SV40 (COS-7, ATCC CRL 1651); monkey kidney cells (CV1 ATCC CCL 70); African
green monkey kidney cells (VERO-76, ATCC CRL-1587); canine kidney cells (MDCK,
ATCC CCL 34); buffalo rat liver cells (BRL 3A, ATCC CRL 1442); myeloma (e.g. NS0)
/hybridoma cells.

Preferred for uses herein are mammalian cells; more preferred are CHO cells.

The DNA sequence of interest of the purified and isolated DNA sequence is usually a
gene of interest preferably encoding a protein operably linked to a promoter as
described above. The purified and isolated DNA sequence comprising at least one DNA
sequence of interest might comprise, in addition to the DNA sequence of interest, a MAR
nucleotide sequence or other chromatin modifying elements.

Purified and isolated DNA sequences comprising a MAR nucleotide sequence are for
example selected from the group comprising the sequences SEQ ID Nos 1 to 23 and/or
particular elements of the cLysMAR, e.g. the B, K and F regions, as well as fragments and
elements and combinations thereof as described above. Other chromatin modifying
elements are for example boundary elements (BEs), locus control regions (LCRs), and
universal chromatin opening elements (UCOEs) (see Zahn-Zabal et al. already cited).
An example of multiple transfections of host cells is shown in Example 12 (table 3).

The first transfecting step (primary transfection) is carried out with the gene of interest
(SV40EGFP) alone, with a MAR nucleotide sequence (MAR) alone or with the gene of
interest and a MAR nucleotide sequence (MAR-SV40EGFP). The second transfecting
step (secondary transfection) is carried out with the gene of interest (SV40EGFP)
alone, with a MAR nucleotide sequence (MAR) alone or with the gene of interest and a
MAR nucleotide sequence (MAR-SV40EGFP), in all possible combinations resulting
from the first transfecting step.

Preferably the eukaryotic host cell is transfected by:

a) introducing a purified DNA sequence comprising one DNA sequence of interest and
additionally a MAR nucleotide sequence,

b) subjecting within a defined time said transfected eukaryotic host cell to at least one
additional transfection step with the same purified DNA sequence comprising one
DNA sequence of interest and additionally a MAR nucleotide sequence of step a).

Surprisingly, a synergy between the first and second transfection has been observed. A
particular synergy has been observed when MAR elements are present at one or both
of the transfection steps. Multiple transfections of the cells with pMAR alone or in
combination with various expression plasmids, using the method described above, have
been carried out. For example, Table 3 shows that transfecting the cells twice with the
pMAR-SV40EGFP plasmid gave the highest expression of GFP and the highest degree
of enhancement of all conditions (4.3-fold). In contrast, transfecting twice the vector
without MAR gave little or no enhancement, 2.8-fold, instead of the expected two-fold
increase. This proves that the presence of MAR elements at each transfection step is of
particular interest to achieve the maximal protein synthesis.

As a particular example of the transfection method, said purified DNA sequence
comprising at least one DNA sequence of interest can be introduced in the form of multiple
unlinked plasmids, comprising a gene of interest operably linked to a promoter, a
selectable marker gene, and/or protein production increasing elements such as MAR
sequences.

The ratio of the first and subsequent DNA sequences may be adapted as required for
the use of specific cell types, and is routine experimentation to one of ordinary skill in
the art.

The defined time for additional transformations of the primary transformed cells is
tightly dependent on the cell cycle and on its duration. Usually the defined time
corresponds to intervals related to the cell division cycle.

Therefore this precise timing may be adapted as required for the use of specific cell
types, and is routine experimentation to one of ordinary skill in the art.

Preferably the defined time is the moment the host cell has just entered into the same
phase of a second or a further cell division cycle, preferably the second cycle.
This time is usually situated between 6 h and 48 h, preferably between 20 h and 24 h,
after the previous transfecting event.
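
By way of illustration only, the timing of successive transfection steps could be planned as in the sketch below. The helper names and the 22 h default interval are assumptions chosen within the preferred 20 h to 24 h window; in practice the interval would be tuned to the cell cycle of the cell type used.

```python
from typing import List

USUAL_RANGE_H = (6.0, 48.0)       # usual interval between transfecting events
PREFERRED_RANGE_H = (20.0, 24.0)  # preferred interval


def is_preferred_interval(interval_h: float) -> bool:
    """True when the interval lies in the preferred 20-24 h window."""
    low, high = PREFERRED_RANGE_H
    return low <= interval_h <= high


def transfection_schedule(n_additional: int, interval_h: float = 22.0) -> List[float]:
    """Times (hours after the primary transfection) of every transfection step.

    n_additional is the number of additional steps applied in step b).
    """
    low, high = USUAL_RANGE_H
    if not low <= interval_h <= high:
        raise ValueError("interval outside the usual 6-48 h range")
    return [i * interval_h for i in range(n_additional + 1)]


# Primary transfection at t = 0 followed by two additional steps, 24 h apart.
schedule = transfection_schedule(2, interval_h=24)
```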

The invention further comprises a transgenic organism wherein its genome has stably
integrated at least one DNA sequence according to the present invention and/or that at
least some of its cells have been transfected according to the above-described method.
Transgenic animals or eukaryotic organisms which can be useful for the present invention
are for example selected from the group comprising mammals (mouse, human, monkey,
etc.) and in particular laboratory animals such as rodents in general, insects (drosophila,
etc.), fishes (zebra fish, etc.), amphibians (frogs, newt, etc.) and other simpler
organisms such as C. elegans, yeast, etc.

Yet another object of the present invention is to provide a purified and isolated DNA
sequence identified according to the method of the present invention, having matrix
attachment region (MAR) activity, i.e. capable of having protein production increasing
activity. Preferred purified and isolated DNA sequences identified according to the present
invention comprise a sequence selected from the sequences SEQ ID Nos 1 to 23 or a
cLysMAR element and/or fragment, a sequence complementary thereof, a part thereof
sharing at least 70% nucleotides in length, a molecular chimera thereof, a combination
thereof and variants.

More preferably, the matrix attachment region (MAR) activity comprises a sequence
selected from the sequences SEQ ID Nos 21 to 23, a sequence complementary
thereof, a part thereof sharing at least 70% nucleotides in length, a molecular chimera
thereof, a combination thereof and variants.

In the present invention, the cLysMAR element and/or fragment consist of at
least one nucleotide sequence selected from the B, K and F regions.
A further object of the present invention is to provide a synthetic MAR sequence
consisting of natural MAR elements and/or fragments assembled between linker
sequences.

Preferably, the synthetic MAR sequence is a cLysMAR and the linker sequences are
BglII-BamHI linkers.

The present invention also provides for a cell transfection mixture or kit comprising at
least one purified and isolated DNA sequence according to the invention.

Also envisioned is a process for the production of a protein wherein a eukaryotic host
cell is transfected according to the process as defined in the present invention and is
cultured in a culture medium under conditions suitable for expression of the protein.
Said protein is finally recovered according to any recovering process known to those
skilled in the art.

Given as an example, the following process for protein production might be used.
The eukaryotic host cell transfected with the transfection method of the present
invention is used in a process for the production of a protein by culturing said cell under
conditions suitable for expression of said protein and recovering said protein. Suitable
culture conditions are those conventionally used for in vitro cultivation of eukaryotic
cells as described e.g. in WO 96/39488. The protein can be isolated from the cell
culture by conventional separation techniques such as e.g. fractionation on
immunoaffinity or ion-exchange columns; precipitation; reverse phase HPLC;
chromatography; chromatofocusing; SDS-PAGE; gel filtration. One skilled in the art will
appreciate that purification methods suitable for the polypeptide of interest may require
modification to account for changes in the character of the polypeptide upon expression
in recombinant cell culture.






The proteins that are produced according to this invention can be tested for
functionality by a variety of methods. For example, the presence of antigenic epitopes
and ability of the proteins to bind ligands can be determined by Western blot assays,
fluorescence cell sorting assays, immunoprecipitation, immunochemical assays and/or
competitive binding assays, as well as any other assay which measures specific binding
activity.

The proteins of this invention can be used in a number of practical applications
including, but not limited to:

1. Immunization with recombinant host protein antigen as a viral/pathogen antagonist. 

2. Production of membrane proteins for diagnostic or screening assays. 

3. Production of membrane proteins for biochemical studies. 

4. Production of membrane protein for structural studies. 

5. Antigen production for generation of antibodies for immuno-histochemical mapping,
including mapping of orphan receptors and ion channels.

The foregoing description will be more fully understood with reference to the following
Examples. Such Examples are, however, exemplary of methods of practising the
present invention and are not intended to limit the scope of the invention.
Examples

Example 1: SMAR Scan and MAR sequences

A first rough evaluation of SMAR Scan was done by analyzing experimentally defined
human MARs and non-MAR sequences. As MAR sequences, the previous results from
the analysis of human MARs from SMARt DB were used to plot a density histogram for
each criterion as shown in Fig. 1. Similarly, non-MAR sequences were also analyzed
and plotted. As non-MAR sequences, all RefSeq contigs from chromosome 22
were used, considering that this chromosome is big enough to contain a negligible
proportion of MAR sequences relative to non-MAR sequences.

The density distributions shown in Fig. 1 are all skewed with a long tail. For the highest
bend, the highest major groove depth and the highest minor groove width, the
distributions are right-skewed. For the lowest melting temperature, the distributions are
left-skewed, which is natural given the inverse correspondence of this criterion regarding
the three others. For the MAR sequences, biphasic distributions with a second weak
peak are actually apparent. Between the MAR and non-MAR sequence distributions,
a clear shift is also visible in each plot.

Among all human MAR sequences used, on average only about 16% of them have a
value greater than the 75th quantile of the human MARs distribution, this for the four
different criteria. Similarly, concerning the second weak peak of each human MARs
distribution, only 15% of the human MAR sequences are responsible for these outlying
values. Among these 15% of human MAR sequences, most are very well documented
MARs, used to insulate transgenes from position effects, such as the interferon locus
MAR, the beta-globin locus MAR (Ramezani A, Hawley TS, Hawley RG, Performance-
and safety-enhanced lentiviral vectors containing the human interferon-beta scaffold
attachment region and the chicken beta-globin insulator, Blood, 101:4717-4724, 2003) or the
apolipoprotein MAR (Namciu S, Blochlinger KB, Fournier REK, Human matrix
attachment regions insulate transgene expression from chromosomal position effects
in Drosophila melanogaster, Mol. Cell. Biol., 18:2382-2391, 1998).

Still with the same data, human MAR sequences were also used to determine the
association between the four theoretical structural properties computed and the
AT-content. Fig. 2 represents the scatterplot and the corresponding correlation coefficient r
for every pair of criteria.

Example 2: Distribution plots of MAR sequences by organism

MAR sequences from SMARt DB of other organisms were also retrieved and analyzed
similarly, as explained previously. The MAR sequence density distributions for
mouse, chicken, Sorghum bicolor and human are plotted jointly in Fig. 3.

Example 3: MAR prediction of the whole chromosome 22

All RefSeq contigs from chromosome 22 were analyzed by SMAR Scan using the
default settings this time. SMAR Scan predicted a total of 803 MARs,
their average length being 446 bp, which means an average of one MAR predicted per
42 777 bp. The total length of the predicted MARs corresponds to 1% of the
length of chromosome 22. The AT-content of the predicted regions ranged from 65.1% to
93.3%, the average AT-content of all these regions being 73.5%. Thus, predicted MARs
were AT-rich, whereas chromosome 22 is not AT-rich (52.1% AT).

SMARTest was also used to analyze the whole chromosome 22 and obtained 1387
MAR candidates, their average length being 494 bp, representing an average of one
MAR predicted per 24 765 bp. The total length of the predicted MARs corresponds to
2% of chromosome 22. Among all MARs predicted by the two programs, 154
predicted MARs are found by both, which represents respectively 19% and
11% of SMAR Scan and SMARTest predicted MARs. Given the mean length of the
predicted MARs for SMAR Scan and SMARTest, the probability of having by chance an
overlap between SMAR Scan and SMARTest predictions is 0.0027% per
prediction.
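
The coverage and overlap percentages quoted in this Example follow directly from the reported counts, as the short calculation below illustrates. The dictionary layout and function names are purely illustrative; only the numbers come from the text.

```python
# Figures reported in Example 3 for the whole chromosome 22.
smar_scan = {"n_mars": 803, "mean_len_bp": 446, "bp_per_mar": 42_777}
smartest = {"n_mars": 1387, "mean_len_bp": 494, "bp_per_mar": 24_765}
shared_predictions = 154


def coverage_percent(tool: dict) -> float:
    """Share of the analysed length covered by the predicted MARs."""
    analysed_bp = tool["n_mars"] * tool["bp_per_mar"]  # implied analysed length
    predicted_bp = tool["n_mars"] * tool["mean_len_bp"]
    return 100 * predicted_bp / analysed_bp


def shared_percent(tool: dict) -> float:
    """Share of a tool's predictions that the other tool also found."""
    return 100 * shared_predictions / tool["n_mars"]


for name, tool in (("SMAR Scan", smar_scan), ("SMARTest", smartest)):
    print(f"{name}: coverage {coverage_percent(tool):.2f}%, "
          f"shared {shared_percent(tool):.1f}%")
```

Rounded to whole percentages, this gives the 1% and 2% coverage and the 19% and 11% shared predictions stated above.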


To evaluate the specificity of SMAR Scan predictions, SMAR Scan analyses were
performed on randomly shuffled sequences of chromosome 22 (Fig. 4). Shuffled
sequences were generated in three ways: by a segmentation of chromosome 22
into non-overlapping windows of 10 bp and by separately shuffling the nucleotides in
each window; by "scrambling", which means a permutation of all nucleotides of the
chromosome; and by "rubbling", which means a segmentation of the chromosome into
fragments of 10 bp and then a random assembling of these fragments. The first
shuffling method preserves the local nucleotide composition whereas the two other
methods destroy local information, preserving only the global nucleotide composition.
For each shuffling method, five shuffled chromosomes 22 were generated and analyzed
by SMAR Scan using the default settings. Concerning the number of hits, an average of
3 519 170 hits (sd: 18 353) was found for the chromosome 22 permutated within
non-overlapping windows of 10 bp, 171 936.4 hits (sd: 2 859.04) for the scrambled
sequences and 24 708.2 hits (sd: 1 191.59) for the rubbled chromosome 22, which
respectively represents 185% (sd: 1%), 9% (sd: 0.15%) and 1% (sd: ...) of the
number of hits found with the native chromosome 22. For the number of MARs
predicted, which thus means contiguous hits of length greater than 300, 1 997 MARs
were predicted with the chromosome 22 shuffled within windows of 10 bp (sd: 31.2),
only 2.4 MAR candidates were found in scrambled sequences (sd: 1.96) and none for
the rubbled sequence, which respectively represents 249% and less than 0.3% of the
number of predicted MARs found with the native chromosome 22.
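
The three shuffling procedures described in this Example can be sketched as follows. This is an illustrative reconstruction from the description only; the function names and the use of a seeded `random.Random` generator are assumptions, and the toy sequence stands in for chromosome 22.

```python
import random
from typing import List


def shuffle_within_windows(seq: str, window: int, rng: random.Random) -> str:
    """Shuffle nucleotides separately inside each non-overlapping window."""
    out = []
    for i in range(0, len(seq), window):
        chunk = list(seq[i:i + window])
        rng.shuffle(chunk)
        out.append("".join(chunk))
    return "".join(out)


def scramble(seq: str, rng: random.Random) -> str:
    """'Scrambling': one permutation of all nucleotides of the sequence."""
    bases = list(seq)
    rng.shuffle(bases)
    return "".join(bases)


def fragments(seq: str, size: int) -> List[str]:
    """Cut the sequence into consecutive fragments of the given size."""
    return [seq[i:i + size] for i in range(0, len(seq), size)]


def rubble(seq: str, size: int, rng: random.Random) -> str:
    """'Rubbling': cut into fragments, then reassemble them in random order."""
    pieces = fragments(seq, size)
    rng.shuffle(pieces)
    return "".join(pieces)


rng = random.Random(0)   # fixed seed so the sketch is reproducible
native = "ACGTTGCAAT" * 10  # toy 100-nt sequence
windowed = shuffle_within_windows(native, 10, rng)
scrambled = scramble(native, rng)
rubbled = rubble(native, 10, rng)
```

Each shuffled sequence would then be analysed with the same default settings as the native sequence, and the numbers of hits and predicted MARs compared.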



Example 4: Accuracy of SMAR Scan prediction and comparison with other
predictive tools



The accuracy of SMAR Scan was evaluated using six genomic sequences for which
experimentally determined MARs have been mapped. In order to perform a comparison
with other predictive tools, the sequences analyzed are the same as the sequences
previously used to compare MAR-Finder and SMARTest. These genomic sequences
are three plant and three human sequences (Table 1) totalizing 310 151 bp and 37
experimentally defined MARs. The results for SMARTest and MAR-Finder in Table 1
come from a previous comparison (Frisch M, Frech K, Klingenhoff A, Cartharius K,
Liebich I and Werner T, In silico prediction of scaffold/matrix attachment regions in
large genomic sequences, Genome Research, 12:349-354, 2001).
MAR-Finder has been used with the default parameters, except for the threshold,
which has been set to 0.4, and for the analysis of the protamine locus, where the
AT-richness rule has been excluded (to detect the non AT-rich MARs, as was done for the
protamine locus).
Columns: Sequence, description and reference; Length (kb); Experimentally defined
MARs positions (kb); SMARTest prediction positions (kb); MAR-Finder prediction
positions (kb); SMAR Scan prediction positions (kb)

Oryza sativa putative ADP-glucose pyrophosphorylase subunit SH2 and putative
NADPH-dependent reductase A1 genes (U70541). [4]

30.054

Sorghum bicolor ADP-glucose pyrophosphorylase subunit SH2, NADPH-dependent
reductase A1-b genes (AF010283). [4]

42.44S

Sorghum bicolor clone 11 GK5 (AF124045). [37]



0.0-1.2 
5.4-7.4 



17.3-15.5 
20.0-23.1 



6.5-7.0
15.2-15.7
16.2-16.6
17.6-18.3
19.6-20.1
20.7-21.3
23.6-23.9
25.0-25.4
27.5-27.9



15.7-15.9
17.5-18.4
19.8-20.4
21.3-21.5
23.9-24.2
24.7-25.1



0.0-1.5 
7.1-9.7 

22.4-24.7 



32.5-33.7
41.6-42.3



21.3-21.9 
22.9-24.0 

27.3-27.6 



23.2-24.2 
26.9-27.5 



Human alpha-l-antitry- 
srn and corticosteroid 
tending globulin 
intergenic region 
(AF15654o).l35] 



30.461 



Human protamine locus 
(U15422). [24] 



Human beta-globin 
locus 

(U01317). [21] 



"753SS 



•vU3 

.-5.3 

-.,6.3 

—9.3 

-v15.0 

♦s.18.5 

^21.9 

...23.3 

-s-25.6 

-.29.1 

-.34.6 

-.44.1 
-.•46.5 

.-.57.9 
-^62.9 
-67.1 
-*69.3 
•^73.7 



15.1-15.6 
21.7-22.0 



44.1-443 
47.9-49.5 



63.1-63.7 
74.3-74.7 



il5.6-td 
i - ' 
17.6-182 
i21.**22 



3.4-23i& 

m - j , 



fj7.4-7.7 
SH .5-21 28 
K2 .4.232 
33.G.24J0 
£7.3-27:6 
S3.4-33J9 



14-21 i9 



47.9-49.4 



2.6-6.3 
22.0-30.4 



5.5-6.0 

25.7-26.2 
27.5-27.8 



3.0- 3.2 

5.1- 6.0 
24.9-25.3 
25.5-25.8 
26.2-26.4 
27.5-28.2 



8A-9.T 
32.6-33.6 
37.2-39.4 
51.8-53.0 

1.5-3.U 
15.6-19.0 



33.9-34.8* 
33.9-34.8* 



44.7-52.7 
60.0-70.0 



18.0-18.4 

34.4-34.9 

55.6-57.1 
59.8-60.3 
65.6-66.0 



15.5- 16.0 

18.0- 184 

50.6- 50.8 
56.5-57.2 

58.1- 58.5 
63.0-63.6 



5^26.4 



[5.3^15.6 

■i - ; 
ii : • 

&.8i63.1 
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0 



5 



0 



5 



Sum(kb) 



Total numbers : ~ 
Average kb /predicted 
MAR 

True positives [number 
of experimentally 
defined MAR found] 
False positives 
False negatives 
Specificity 
Sensitivitv 



310.151 



at least 35.1 



67.6-675 
68.8.69.1 



14.5 



28 
11.076 

19114] 



9 

23 

19:28= 68% 
14«7= 38% 



68.7-69 J3 



S. 



25 . 
12.4G8! 

2qi2]i 



25 ! 
20/25= Bfi%. 
12^37=82% 



86.^66.7 



9.5 



22 
14.097 

17[14) 



17/22 



1U/37- 



5 

23 

77% 

= 38% 



Table 1: Evaluation of SMAR Scan accuracy! 



Six different genomic sequences, three plant and three human sequences, for which
experimentally defined MARs are known, were analyzed with MAR-Finder, SMARTest
and SMAR Scan. True positive matches are printed in bold, minus (-) indicates false
negative matches. Some of the longer experimentally defined MARs contained more
than one in silico prediction; each of them was counted as a true positive match.
Therefore, the number of true in silico predictions is higher than the number of
experimentally defined MARs found. Specificity is defined as the ratio of true positive
predictions, whereas sensitivity is defined as the ratio of experimentally defined MARs
found. * AT-rich rule excluded using MAR-Finder.

SMARTest predicted 28 regions as MARs, 19 (true positives) of these correlate with
experimentally defined MARs (specificity: 68%) whereas 9 (32%) are located in non-
MARs (false positives). As some of the longest experimentally determined
MARs contain more than one in silico prediction, the 19 true positives correspond
actually to 14 different experimentally defined MARs (sensitivity: 38%). MAR-Finder
predicted 25 regions as MARs, 20 (specificity: 80%) of these correlate with
experimentally defined MARs corresponding to 12 different experimentally defined
MARs (sensitivity: 32%). SMAR Scan predicted 22 regions, 17 being true positives
(specificity: 77%) matching 14 different experimentally defined MARs (sensitivity: 38%).
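
The percentages quoted above follow directly from the counts in Table 1. The short Python sketch below recomputes them using the definitions given in the Table 1 legend (specificity as the fraction of predictions that are true positives, sensitivity as the fraction of the 37 experimentally defined MARs that were found). The function and variable names are illustrative and are not part of SMAR Scan.

```python
def prediction_accuracy(predicted, true_positive_predictions, mars_found, mars_total):
    """Return (specificity, sensitivity) as defined in the Table 1 legend."""
    specificity = true_positive_predictions / predicted  # true predictions / all predictions
    sensitivity = mars_found / mars_total                # MARs found / all known MARs
    return specificity, sensitivity


# Counts from Table 1: (predicted regions, true positive predictions,
# number of experimentally defined MARs found).
TOOLS = {
    "SMARTest": (28, 19, 14),
    "MAR-Finder": (25, 20, 12),
    "SMAR Scan": (22, 17, 14),
}
TOTAL_MARS = 37  # experimentally defined MARs in the six sequences

for name, (predicted, true_pos, found) in TOOLS.items():
    spec, sens = prediction_accuracy(predicted, true_pos, found, TOTAL_MARS)
    print(
        f"{name}: specificity {spec:.0%}, sensitivity {sens:.0%}, "
        f"false positives {predicted - true_pos}, "
        f"false negatives {TOTAL_MARS - found}"
    )
```

Note that a tool can report more true positive predictions than MARs found, because several predictions may fall within one long experimentally defined MAR.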

As another example, the same analysis has been applied to human chromosomes 1
and 2 and led to the determination of 23 MAR sequences (SEQ ID No. 1 to 23). These
sequences are listed in Annex 1 in ST25 format.

Example 5: Dissection of the chicken lysozyme gene 5'-MAR

The 3000 base pair 5'-MAR was dissected into smaller fragments that were monitored
for effect on transgene expression in Chinese hamster ovary (CHO) cells. To do so,
seven fragments of ~400 bp were generated by polymerase chain reaction (PCR).
These PCR-amplified fragments were contiguous and cover the entire MAR sequence
when placed end-to-end. Four copies of each of these fragments were ligated in a
head-to-tail orientation, to obtain a length corresponding to approximately half of that of
the natural MAR. The tetramers were inserted upstream of the SV40 promoter in
pGEGFPControl, a modified version of the pGL3Control vector (Promega). The plasmid
pGEGFPControl was created by exchanging the luciferase gene of pGL3Control for the
EGFP gene from pEGFP-N1 (Clontech). The 5'-MAR-fragment-containing plasmids
thus created were co-transfected with the resistance plasmid pSVneo in CHO-DG44
cells using LipofectAmine 2000 (Invitrogen) as transfection reagent, as performed
previously (Zahn-Zabal, M., et al., "Development of stable cell lines for production or
regulated expression using matrix attachment regions" J Biotechnol, 2001. 87(1): p.
29-42.). After selection of the antibiotic (G-418) resistant cells, polyclonal cell
populations were analyzed by FACS for EGFP fluorescence.

Transgene expression was expressed as the percentage of high expressor cells, defined
as the cells whose fluorescence levels are at least 4 orders of magnitude higher than the
average fluorescence of cells transfected with the pGEGFPControl vector without MAR.
Fig. 5 shows that multimerized fragments B, K and F enhance transgene expression,
despite their shorter size as compared to the original MAR sequence. In contrast, other
fragments are poorly active or fully inactive.




Example 6: Specificity of B, K and F regions in the MAR context

The 5'-MAR was serially deleted from the 5'-end (Fig. 6, upper part) or the 3'-end (Fig. 6,
lower part), respectively. The effect of the truncated elements was monitored in an
assay similar to that described in the previous section. Figure 6 shows that the loss of
ability to stimulate transgene expression in CHO cells was not evenly distributed.

In this deletion study, the loss of MAR activity coincided with discrete regions of
transition which overlap with the 5'-MAR K- and F-fragment respectively. In 5'
deletions, activity was mostly lost when fragment K and F were removed. 3' deletions
that removed the F and B elements had the most pronounced effects. In contrast,
flanking regions A, D, E and G, which have little or no ability to stimulate transgene
expression on their own (Fig. 5), correspondingly did not contribute to the MAR activity
in the 5'- and 3'-end deletion studies (Fig. 6).

Example 7: Structure of the F element

The 465 bp F fragment was further dissected into smaller sub-fragments of 234, 243,
213 bp and 122, 125 and 121 bp, respectively. Fragments of the former group were
octamerized (8 copies) in a head-to-tail orientation, while those of the latter group were
similarly hexa-decamerized (16 copies), to maintain a constant length of MAR
sequence. These elements were cloned in the pGEGFPControl vector and their effects
were assayed in CHO cells as described previously. Interestingly, fragment FIII retained
most of the activity of the full-length F fragment whereas fragment FII, which contains
the right-hand side part of fragment FIII, lost all the ability to stimulate transgene
expression (Fig. 7). This points to an active region comprised between nt 132 and nt
221 in the FIB fragment. Consistently, multiple copies of fragments FI and FIB, which
encompass this region, displayed similar activity. FIIA on its own has no activity.
However, when added to FIB, resulting in FIII, it enhances the activity of the former.
Therefore FIIA appears to contain an auxiliary sequence that has little activity on its
own, but that strengthens the activity of the minimal domain located in FIB.
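
The copy numbers used above keep the total multimer length roughly constant across sub-fragments. The Python sketch below illustrates this arithmetic with the sub-fragment lengths quoted in the text. The helper names are illustrative, and the toy sequence passed to `head_to_tail` is a placeholder, not a MAR sequence.

```python
def head_to_tail(fragment, copies):
    """Concatenate identical copies of a fragment in the same orientation."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return fragment * copies


def multimer_length(fragment_bp, copies):
    """Total length in bp of a head-to-tail multimer (linkers ignored)."""
    return fragment_bp * copies


# Sub-fragment lengths (bp) quoted in the text and the copy number used for each group.
OCTAMERIZED_BP = (234, 243, 213)        # 8 copies each
HEXADECAMERIZED_BP = (122, 125, 121)    # 16 copies each

for bp in OCTAMERIZED_BP:
    print(f"{bp} bp x 8  = {multimer_length(bp, 8)} bp")
for bp in HEXADECAMERIZED_BP:
    print(f"{bp} bp x 16 = {multimer_length(bp, 16)} bp")

# Toy placeholder sequence, only to show the orientation-preserving concatenation.
print(head_to_tail("ACGT", 3))
```

Linker sequences introduced during cloning are ignored here, so the actual inserts would be slightly longer than the computed values.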

Analysis of the distribution of individual motifs within the lysozyme gene 5'-MAR is
shown in Fig. 8A, along with some additional motifs that we added to the analysis.
Most of these motifs were found to be dispersed throughout the MAR element, and not
specifically associated with the active portions. For instance, the binding sites of
transcription factors and other motifs that have been associated with MARs were not
preferentially localized in the active regions. It has also been proposed that active MAR
sequences may consist of a combination of distinct motifs. Several computer programs
(MAR Finder, MARscan, SMARTest, SIDD duplex stability) have been reported to
identify MARs as regions of DNA that associate with the nuclear matrix. They are usually
based on algorithms that utilize a predefined series of sequence-specific patterns that
have previously been suggested as containing MAR activity, as exemplified by MAR
Finder, now known as MAR Wiz. The output of these programs did not correlate well
with the transcriptionally active portions of the cLysMAR. For instance, peaks of activity
obtained with MAR Finder did not clearly match active MAR sub-portions: the B
fragment is quite active in vivo but scores negative with MAR Finder (Fig. 8B,
compare the top and middle panels). Bent DNA structures, as predicted by this
program, did not correlate well with activity either (Fig. 8B, compare the top and bottom
panels). Similar results were obtained with the other available programs (data not
shown).

The motifs identified by available MAR prediction computer methods are therefore
unlikely to be the main determinants of the ability of the cLysMAR to increase gene
expression. Therefore, a number of other computer tools were tested. Surprisingly,
predicted nucleosome binding sequences and nucleosome disfavouring sequences
were found to be arranged in repetitively interspersed clusters over the MAR, with the
nucleosome favouring sites overlapping the active B, K and F regions. Nucleosome
positioning sequences were proposed to consist of DNA stretches that can easily wrap
around the nucleosomal histones, and they had not been previously associated with
MAR sequences.

Nucleosome-favouring sequences may be modelled by a collection of DNA features
that include moderately repeated sequences and other physico-chemical parameters
that may allow the correct phasing and orientation of the DNA over the curved histone
surface. Identification of many of these DNA properties may be computerized, and up
to 38 different such properties have been used to predict potential nucleosome
positions. Therefore, we set out to determine whether specific components of nucleosome
prediction programs might correlate with MAR activity, with the objective of constructing a
tool allowing the identification of novel and possibly more potent MARs from genomic
sequences.

To determine whether any aspects of DNA primary sequence might distinguish the
active B, K and F regions from the surrounding MAR sequence, we analyzed the 5'-
MAR with MARScan. Of the 38 nucleosomal array prediction tools, three were found to
correlate with the location of the active MAR sub-domains (Fig. 9A). Location of the
MAR B, K and F regions coincides with maxima for DNA bending, major groove depth
and minor groove width. A weaker correlation was also noted with minima of the DNA
melting temperature, as determined by the GC content. Refined mapping over the MAR
F fragment indicated that the melting temperature valley and DNA bending summit
indeed correspond to the FIB sub-fragment that contains the MAR minimal domain (Fig.
9B). Thus active MAR portions may correspond to regions predicted as curved DNA
regions by this program, and we will refer to these regions as CUE-B, CUE-K and CUE-
F in the text below. Nevertheless, whether these regions correspond to actual bent DNA
and base-pair unwinding regions is unknown, as they do not correspond to bent DNA
as predicted by MAR Wiz (Fig. 9B).
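
To illustrate how a melting temperature valley can be located from GC content alone, the following Python sketch computes a sliding-window GC profile and reports the window with the lowest GC fraction. This is only a minimal sketch under stated assumptions: the window size, step and function names are arbitrary choices made for illustration, and the code is not the algorithm used by MARScan or SMAR Scan, which also evaluate DNA bending and groove geometry.

```python
def gc_profile(seq, window=100, step=10):
    """Return a list of (window start, GC fraction) along the sequence."""
    seq = seq.upper()
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    if len(seq) < window:
        raise ValueError("sequence is shorter than the window")
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc_fraction = (chunk.count("G") + chunk.count("C")) / window
        profile.append((start, gc_fraction))
    return profile


def lowest_gc_window(seq, window=100, step=10):
    """Return the first window with minimal GC fraction (a proxy for low Tm)."""
    return min(gc_profile(seq, window, step), key=lambda item: item[1])


# Toy sequence: a GC-rich block, an AT-rich block, then another GC-rich block.
toy = "GC" * 50 + "AT" * 50 + "GC" * 50
start, gc = lowest_gc_window(toy)
print(f"lowest GC window starts at {start} with GC fraction {gc:.2f}")
```

On a real MAR sequence, the positions of such minima could then be compared with the boundaries of the B, K and F fragments.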
Example 8: Imprints of other regulatory elements in the F fragment

Nucleosome positioning features may be considered as one of the many specific
chromatin codes contained in genomic DNA. Although this particular code may
contribute to the activity of the F region, it is unlikely to determine MAR activity alone,
as the 3' part of the F region enhanced activity of the minimal MAR domain contained in
the FIB portion. Using the MatInspector program (Genomatix), we searched for
transcription factor binding sites with scores higher than 0.92 and found DNA binding
sequences for the NMP4 and MEF2 proteins in the 3' part of the F fragment (Fig. 8B).
To determine whether any of these transcription factor-binding sites might localize close
to the B and K active regions, the entire 5'-MAR sequence was analyzed for binding by
NMP4 and MEF2 and proteins reported to bind to single-stranded or double-stranded
form of BURs. Among those, SATB1 (special AT-rich binding protein 1) belongs to a
class of DNA-binding transcription factors that can either activate or repress the
expression of nearby genes. This study indicated that specific proteins such as SATB1,
NMP4 (nuclear matrix protein 4) and MEF2 (myogenic enhancer factor 2) have a
specific distribution and form a framework around the minimal MAR domains of
cLysMAR (Fig. 10). The occurrence of several of these NMP4 and SATB1 binding sites
has been confirmed experimentally by the EMSA analysis of purified recombinant
proteins (data not shown).

Example 9: Construction of artificial MARs by combining defined genetic
elements

To further assess the relative roles of the various MAR components, the cLysMAR was
deleted of all three CUE regions (Fig. 11, middle part), which resulted in the loss of part
of its activity when compared to the complete MAR sequence similarly assembled from
all of its components as a control (Fig. 11, top part). Consistently, one copy of each
CUE alone, or one copy of each of the three CUEs assembled head-to-tail, had little
activity in the absence of the flanking sequences. These results strengthen the
conclusion that optimal transcriptional activity requires the combination of CUEs with
flanking sequences. Interestingly, the complete MAR sequence generated from each of
its components, but containing also BglII-BamHI linker sequences (AGATCC) used to
assemble each DNA fragment, displayed high transcriptional activity (6 fold activation)
as compared to the 4.8 fold noted for the original MAR element in this series of assays
(see Fig. 5).
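
The AGATCC linker mentioned above is what results when a BglII-cut end is ligated to a BamHI-cut end: both enzymes leave the same GATC overhang, and the hybrid junction is recognised by neither enzyme. The Python sketch below reproduces this on the top strand only. The recognition sites and cut positions of BglII (A^GATCT) and BamHI (G^GATCC) are standard; the function names are illustrative.

```python
BGLII_SITE = "AGATCT"   # BglII cuts A^GATCT
BAMHI_SITE = "GGATCC"   # BamHI cuts G^GATCC
CUT_OFFSET = 1          # both enzymes cut after the first base of the site


def upstream_end(site, cut_offset=CUT_OFFSET):
    """Top-strand bases left on the upstream fragment after cutting."""
    return site[:cut_offset]


def downstream_end(site, cut_offset=CUT_OFFSET):
    """Top-strand bases left on the downstream fragment after cutting."""
    return site[cut_offset:]


def hybrid_junction(upstream_site, downstream_site):
    """Top-strand junction formed by ligating two compatible cut ends."""
    return upstream_end(upstream_site) + downstream_end(downstream_site)


junction = hybrid_junction(BGLII_SITE, BAMHI_SITE)
print(junction)                                   # junction sequence
print(junction in (BGLII_SITE, BAMHI_SITE))       # False: neither site is restored
```

Because the junction is no longer cleavable by either enzyme, fragments can be added one after another in a fixed orientation without re-cutting earlier junctions.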

We next investigated whether the potentially curved DNA regions may also be active in
an environment different from that found in their natural MAR context. Therefore, we set
out to swap the CUE-F, CUE-B and CUE-K elements, keeping the flanking sequences
unchanged. The sequences flanking the CUE-F element were amplified by PCR and
assembled to bracket the various CUEs, keeping their original orientation and distance,
or without a CUE. These engineered ~1.8 kb MARs were then assayed for their ability
to enhance transgene expression as above. All three CUEs were active in this context,
and therefore their action is not restricted to one given set of flanking sequences.
Interestingly, the CUE-K element was even more active than CUE-F when inserted
between the CUE-F flanking sequences, and the former composite construct exhibited
an activity as high as that observed for the complete natural MAR (4.8 fold activation).
What distinguishes the CUE-K element from CUE-F and CUE-B is the presence of
overlapping binding sites for the MEF-2 and SatB1 proteins, in addition to its CUE
feature. Therefore, fusing CUE-B with CUE-F-flanking domain results in a higher
density of all three binding sites, which is a likely explanation for the increased activity.
These results indicate that assemblies of CUEs with sequences containing binding sites
for proteins such as NMP4, MEF-2, SatB1, and/or polyPpolyQ proteins constitute
potent artificial MAR sequences.

Example 10: Expression vectors

Three expression vectors according to the present invention are represented on Figure
12.

Plasmid pPAG01 is a 5640 bp pUC19 derivative. It contains a 2960 bp chicken DNA
fragment cloned in BamHI and XbaI restriction sites. The insert comes from the border
of the 5'-end of the chicken lysozyme locus and has a high A/T-content.

Plasmid pGEGFP (also named pSV40EGFP) control is a derivative of the pGL3-
control vector (Promega) in which the luciferase gene sequence has been replaced by
the EGFP gene sequence from the pEGFP-N1 vector (Clontech). The size of the pGEGFP
plasmid is 4334 bp.

Plasmid pUbCEGFP control is a derivative of pGL3 with a ubiquitin promoter.

Plasmid pPAG01GFP (also named pMAR-SV40EGFP) is a derivative of pGEGFP with
the 5'-Lys MAR element cloned in the MCS located just upstream of the SV40
promoter. The size of the pPAG01GFP plasmid is 7285 bp.

Example 11: Effect of the additional transfection of primary transfectant cells on
transgene expression

One day before transfection, cells were plated in a 24-well plate, in growth medium at a
density of 1.35 x 10^5 cells/well for CHO-DG44 cells. 16 hours post-inoculum, cells were
transfected when they reached 30-40% confluence, using Lipofect-AMINE 2000
(hereinafter LF2000), according to the manufacturer's instructions (Invitrogen). Twenty-
seven microliters of serum free medium (Opti-MEM; Invitrogen) containing 1.4 µl of
LF2000 were mixed with 27 µl of Opti-MEM containing 830 ng of linear plasmid DNA.
The antibiotic selection plasmid (pSVneo) amounted to one tenth of the reporter
plasmid bearing the GFP transgene. The mix was incubated at room temperature for 20
min to allow the DNA-LF2000 complexes to form. The mixture was diluted with 300 µl
of Opti-MEM and poured into previously emptied cell-containing wells. Following 3
hours incubation of the cells with the DNA mix at 37°C in a CO2 incubator, one ml of
DMEM-based medium was added to each well. The cells were further incubated for 24
hours in a CO2 incubator at 37°C. The cells were then transfected a second time
according to the method described above, except that the resistance plasmid carried
another resistance gene (pSVpuro). Twenty-four hours after the second transfection,
cells were passaged and expanded into a T-75 flask containing selection medium
supplemented with 500 µg/ml G-418 and 5 µg/ml puromycin. After a two week selection
period, stably transfected cells were cultured in 6-well plates. Alternatively, the cell
population was transfected again using the same method, but with pTKhygro (Clontech)
and pSVdhfr as resistance plasmids. The expression of GFP was analysed with a
fluorescence-activated cell sorter (FACS) and with a Fluoroscan.
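
The quantities in this protocol can be checked with a few lines of arithmetic. The Python sketch below assumes, as one possible reading of the text, that the 830 ng is the total DNA per well and that the selection plasmid mass is one tenth of the reporter plasmid mass; if the 830 ng refers to the reporter plasmid alone, the numbers change accordingly. Function names are illustrative.

```python
def split_dna_mix(total_ng, selection_to_reporter=0.1):
    """Split a total DNA mass into (reporter ng, selection plasmid ng).

    Assumes total = reporter + selection and selection = ratio * reporter.
    """
    reporter_ng = total_ng / (1 + selection_to_reporter)
    selection_ng = total_ng - reporter_ng
    return reporter_ng, selection_ng


def transfection_volume_ul(lipid_mix_ul=27.0, dna_mix_ul=27.0, diluent_ul=300.0):
    """Volume added to each well after dilution of the DNA-LF2000 complexes."""
    return lipid_mix_ul + dna_mix_ul + diluent_ul


reporter, selection = split_dna_mix(830.0)
print(f"reporter plasmid: {reporter:.1f} ng, selection plasmid: {selection:.1f} ng")
print(f"volume per well: {transfection_volume_ul():.0f} µl")
```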

Fig. 13 shows that the phenotype of the twice-transfected cells (hereafter called
secondary transfectants) not only was strongly coloured, such that special bulb and
filter were not required to visualize the green color from the GFP protein, but also
contained a majority of producing cells (bottom right-hand side FACS histogram) as
compared to the parental population (central histogram). This level of fluorescence
corresponds to specific cellular productivities of at least 10 pg per cell per day. Indeed,
cells transfected only one time (primary transfectants) that did not express the marker
protein were almost totally absent from the cell population after re-transfection. Bars
below 10^1 units of GFP fluorescence amounted to 30% in the central histogram and less
than 5% in the right histogram. This suggested that additional cells had been
transfected and successfully expressed GFP.

Strikingly, the amount of fluorescence exhibited by re-transfected cells suggested that
the subpopulation of cells having incorporated DNA twice expressed much more GFP
than the expected two-fold increase. Indeed, the results shown in Table 2 indicate that
the secondary transfectants exhibited, on average, more than the two-fold increase of
GFP expected if two sets of sequences, one at each successive transfection, had
been integrated independently and with similar efficiencies. Interestingly, this was
not dependent on the promoter sequence driving the reporter gene, as both viral and
cellular promoter-containing vectors gave a similar GFP enhancement (compare lanes 1
and 2). However, the effect was particularly marked for the MAR-containing vector as
compared to plasmids without MAR (lane 3), where the two consecutive transfections
resulted in a 5.3 and 4.6 fold increase in expression, in two distinct experiments.



Experiment 1

Type of plasmid     Primary transfection     Secondary transfection     Fold increase
                    (EGFP fluorescence)      (EGFP fluorescence)
pUbCEGFP            4'992                    14'334                     2.8
pSV40EGFP           4'324                    12'237                     2.8
pMAR-SV40EGFP       6'996                    36'748                     5.3

Experiment 2

Type of plasmid     Primary transfection     Secondary transfection     Fold increase
                    (EGFP fluorescence)      (EGFP fluorescence)
pUbCEGFP            6'452                    15'794                     2.5
pSV40EGFP           4'433                    11'735                     2.6
pMAR-SV40EGFP       8'116                    37'475                     4.6



Table 2. Effect of re-transfecting primary transfectants at 24 hours interval on GFP
expression. Two independent experiments are shown. The resistance plasmid
pSVneo was co-transfected with various GFP expression vectors. One day post-
transfection, cells were re-transfected with the same plasmids with the difference
that the resistance plasmid was changed for pSVpuro. Cells carrying both resistance
genes were selected on 500 µg/ml G-418 and 5 µg/ml puromycin and the expression
of the reporter gene marker was quantified by Fluoroscan. The fold increases
correspond to the ratio of fluorescence obtained from two consecutive transfections
as compared to the sum of fluorescence obtained from the corresponding
independent transfections. The fold increases that were judged significantly higher
are shown in bold, and correspond to fluorescence values that are consistently over
2-fold higher than the addition of those obtained from the independent transfections.
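
As a consistency check on Table 2, the tabulated fold increases can be reproduced, to within rounding, by dividing the secondary by the primary fluorescence for each plasmid. The Python sketch below does this for both experiments; the data layout and function name are illustrative, and the 0.1 tolerance is an arbitrary choice that accommodates the rounding used in the table.

```python
# (primary fluorescence, secondary fluorescence, fold increase reported in Table 2)
TABLE_2 = {
    "experiment 1": {
        "pUbCEGFP": (4992, 14334, 2.8),
        "pSV40EGFP": (4324, 12237, 2.8),
        "pMAR-SV40EGFP": (6996, 36748, 5.3),
    },
    "experiment 2": {
        "pUbCEGFP": (6452, 15794, 2.5),
        "pSV40EGFP": (4433, 11735, 2.6),
        "pMAR-SV40EGFP": (8116, 37475, 4.6),
    },
}


def fold_increase(primary, secondary):
    """Ratio of fluorescence after two transfections to that after one."""
    return secondary / primary


for experiment, rows in TABLE_2.items():
    for plasmid, (primary, secondary, reported) in rows.items():
        computed = fold_increase(primary, secondary)
        agrees = abs(computed - reported) < 0.1
        print(f"{experiment} {plasmid}: computed {computed:.2f}, "
              f"reported {reported}, agrees within 0.1: {agrees}")
```

In both experiments the MAR-containing vector is the only one whose ratio clearly exceeds the two-fold increase expected from two independent integrations.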

The increase in the level of GFP expression in multiply transfected cells was not
expected from current knowledge, and this effect had not been observed previously.

Taken together, the data presented here support the idea that the plasmid sequences
that primarily integrated into the host genome would facilitate integration of other
plasmids by homologous recombination with the second incoming set of plasmid
molecules. Plasmid recombination events occur within a 1-h interval after the plasmid
DNA has reached the nucleus and the frequency of homologous recombination
between co-injected plasmid molecules in cultured mammalian cells has been shown to
be extremely high, approaching unity (Folger, K.R., K. Thomas, and M.R. Capecchi,
Nonreciprocal exchanges of information between DNA duplexes coinjected into
mammalian cell nuclei. Mol Cell Biol, 1985. 5(1): p. 59-69), explaining the integration of
multiple plasmid copies. However, homologous recombination between newly
introduced DNA and its chromosomal homolog normally occurs very rarely, at a
frequency of 1 in 10^3 cells receiving DNA at the most (Thomas, K.R., K.R. Folger, and
M.R. Capecchi, High frequency targeting of genes to specific sites in the mammalian
genome. Cell, 1986. 44(3): p. 419-28). Thus, the results might indicate that the MAR
element surprisingly acts to promote such recombination events. MARs would not only
modify the organization of genes in vivo, and possibly also allow DNA replication in
conjunction with viral DNA sequences, but they may also act as DNA recombination
signals.

Example 12: MARs mediate the unexpectedly high levels of expression

If MAR-driven recombination events were to occur in the multiple transfections process,
we expect that the synergy between the primary and secondary plasmid DNA would be
affected by the presence of MAR elements at one or both of the transfection steps. We
examined this possibility by multiple transfections of the cells with pMAR alone or in
combination with various expression plasmids, using the method described previously.
Table 3 shows that transfecting the cells twice with the pMAR-SV40EGFP plasmid gave
the highest expression of GFP and the highest degree of enhancement of all conditions
(4.3 fold). In contrast, transfecting twice the vector without MAR gave little or no
enhancement, 2.8-fold, instead of the expected two-fold increase. We conclude that the
presence of MAR elements at each transfection step is necessary to achieve the
maximal protein synthesis.

Table 3

Primary transfection               Secondary transfection
Type of plasmid   EGFP             Type of plasmid   EGFP             Fold
                  fluorescence                       fluorescence     increase
pMAR              0                pMAR              0                0
                                   pSV40EGFP         15'437           2.3-2.5
                                   pMAR-SV40EGFP     30'488           2.6-2.7
pMAR-SV40EGFP     11'278           pMAR-SV40EGFP     47'027           4.3-5.3
                                   pMAR              12'319           1.0-1.1
pSV40EGFP         6'114            pSV40EGFP         17'200           2.8
                                   pMAR              11'169           1.8-2.3



Interestingly, when cells were first transfected with pMAR alone, and then re-
transfected with pSV40EGFP or pMAR-SV40EGFP, the GFP levels were more than
doubled as compared to those resulting from the single transfection of the latter
plasmids (2.5 and 2.7 fold respectively, instead of the expected 1-fold). This indicates
that the prior transfection of the MAR can increase the expression of the plasmid used
in the second transfection procedure. Because MARs act only locally on chromatin
structure and gene expression, this implies that the two types of DNA may have
integrated at a similar chromosomal locus. In contrast, transfecting the GFP expression
vectors alone, followed by the MAR element in the second step, yielded little or no
improvement of the GFP levels. This indicates that the order of plasmid
transfection is important, and that the first transfection event should contain a MAR
element to allow significantly higher levels of transgene expression.
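
The order effect described above can be read off Table 3 by comparing each two-step transfection with the single transfection of the same GFP plasmid. The Python sketch below performs this comparison using the fluorescence values of Table 3; the variable and function names are illustrative, and the single-experiment ratios computed here fall within or close to the ranges reported in the table.

```python
# Fluorescence after a single transfection of each GFP plasmid (Table 3, primary column).
SINGLE_TRANSFECTION = {"pSV40EGFP": 6114, "pMAR-SV40EGFP": 11278}

# (first plasmid, second plasmid, fluorescence after both transfections), from Table 3.
TWO_STEP = [
    ("pMAR", "pSV40EGFP", 15437),
    ("pSV40EGFP", "pMAR", 11169),
    ("pMAR", "pMAR-SV40EGFP", 30488),
    ("pMAR-SV40EGFP", "pMAR", 12319),
]


def fold_vs_single(first, second, fluorescence):
    """Fold change relative to the single transfection of the GFP plasmid used."""
    gfp_plasmid = second if first == "pMAR" else first
    return fluorescence / SINGLE_TRANSFECTION[gfp_plasmid]


for first, second, fluorescence in TWO_STEP:
    fold = fold_vs_single(first, second, fluorescence)
    print(f"{first} then {second}: {fold:.2f}-fold")
```

For both GFP plasmids, the ratio is higher when pMAR is transfected first than when it is transfected second.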

If MAR elements favoured the homologous recombination of the plasmids remaining in
episomal forms from the first and second transfection procedures, followed by their co-
integration at one chromosomal locus, one would expect that the order of plasmid
transfection would not affect GFP levels. However, the above findings indicate that it is
more favourable to transfect the MAR element in the first rather than in the second
transfection event. This suggests the following molecular mechanism: during the first
transfection procedure, the MAR elements may concatemerize and integrate, at least in
part, in the cellular chromosome. This integrated MAR DNA may in turn favour the
further integration of more plasmids, during the second transfection procedure, at the
same or at a nearby chromosomal locus.

Example 13: MARs as long term DNA transfer facilitators

If integrated MARs mediated a persistent recombination-permissive chromosomal
structure, one would expect high levels of expression even if the second transfection
was performed long after the first one, at a time when most of the transiently introduced
episomal DNA has been eliminated. To address this possibility, the cells from Table 3,
selected for antibiotic resistance for three weeks, were transfected again once or twice
and selected for the incorporation of additional DNA resistance markers. The tertiary, or
the tertiary and quaternary transfection cycles, were performed with combinations of
pMAR or pMAR-SV40EGFP, and analyzed for GFP expression as before.

Table 4

Tertiary transfection                          Quaternary transfection
Type of plasmid   EGFP           Fold          Type of plasmid   EGFP           Fold
                  fluorescence   increase                        fluorescence   increase
pMAR              18'368         2.2           pMAR              43'186         2.4
                                               pMAR-SV40EGFP     140'000        7.8
pMAR-SV40EGFP     18'544         2.0           pMAR-SV40EGFP     91'000         5.5
                                               pMAR              3[?]'814       2.0



Table 4. MARs act as facilitator of DNA integration. The pMAR-SV40EGFP/ pMAR-
SV40EGFP secondary transfectants were used in a third cycle of transfection at the
end of the selection process. The tertiary transfection was accomplished with pMAR or
pMAR-SV40EGFP, and pTKhygro as selection plasmid, to give tertiary transfectants.
After 24 hours, cells were transfected again with either plasmid and pSVdhfr, resulting
in the quaternary transfectants, which were selected in growth medium containing 500
µg/ml G-418, 5 µg/ml puromycin, 300 µg/ml hygromycin B and 5 µM methotrexate.
The secondary transfectants initially exhibited a GFP fluorescence of 8300. The fold
increases correspond to the ratio of fluorescence obtained from two consecutive
transfections as compared to the sum of fluorescence obtained from the corresponding
independent transfections. The fold increases that were judged significantly higher are
shown in bold, and correspond to fluorescence values that are 2-fold higher than the
addition of those obtained from the independent transfections.

These results show that loading more copies of pMAR or pMAR-SV40EGFP resulted in
similar 2-fold enhancements of total cell fluorescence. Loading even more of the MAR
in the quaternary transfection further enhanced this activity by another 2.4-fold. This is
consistent with our hypothesis that newly introduced MAR sequences may integrate at
the chromosomal transgene locus by homologous recombination and thereby further
increase transgene expression.

When the cells were transfected a third and fourth time with the pMAR-SV40EGFP
plasmid, GFP activity further increased, once again to levels not expected from the
addition of the fluorescence levels obtained from independent transfections. GFP
expression reached levels that resulted in cells visibly glowing green in daylight
(Fig. 14). These results further indicate that the efficiency of the quaternary transfection
was much higher than that expected from the efficacy of the third DNA transfer,
indicating that proper timing between transfections is crucial to obtain the optimal gene
expression increase, one day being preferred over a three week period.

We believe that MAR elements favour secondary integration events by increasing
recombination frequency at their site of chromosomal integration by relaxing closed
chromatin structure, as they mediate a local increase of histone acetylation (Yasui, D.,
et al., SATB1 targets chromatin remodelling to regulate genes over long distances.
Nature, 2002. 419(6907): p. 641-5). Alternatively, or concomitantly, MARs potentially
relocate nearby genes to subnuclear locations thought to be enriched in trans-acting
factors, including proteins that can participate in recombination events such as
topoisomerases. This can result in a locus in which the MAR sequences can bracket
the pSV40EGFP repeats, efficiently shielding the transgenes from chromatin-mediated
silencing effects.

CLAIMS

1 . A purified and isolated DNA sequence having protein production increasing 
activity characterized in that said DNA sequence comprises 

5 a) at least one bent DNA element, 

b) and at least one binding site for a DNA binding protein, 

2. The purified and isolated DNA sequence of claim 1 , characterized in that said 
bent DNA element is a MAR nucleotide sequence selected from the group comprising 
the sequences SEQ ID Nos 1 to 23, a sequence complementary thereof, a part thereof 
sharing at least 70% nucleotides in length, a molecular chimera thereof, a combination 
thereof and variants. 

3. The purified and isolated DNA sequence of claim 1, characterized in that said
bent DNA element is a cLysMAR element and/or fragment, a sequence complementary 
thereof, a part thereof sharing at least 70% nucleotides in length, a molecular chimera 
thereof, a combination thereof and variants. 

4. The purified and isolated DNA sequence of claim 3, characterized in that said
part thereof is a nucleotide sequence selected from the B, K and F regions.

5. The purified and isolated sequence of claims 1 to 4, characterized in that said 
DNA binding protein is a transcription factor. 

6. The purified and isolated sequence of claim 5, characterized in that the
transcription factor is selected from the group comprising the polyQpolyP domain 
proteins. 






7. The purified and isolated sequence of claim 5, characterized in that the
transcription factor is selected from the group comprising SATB1, NMP4, MEF2, S8,
DLX1, FREAC7, BRN2, GATA 1/3, TATA, Bright, MSX or a combination of two or more
of these transcription factors.

8. A method for identifying a MAR sequence using a Bioinformatic tool comprising
the computing of values of one or more DNA sequence features corresponding to DNA
bending, major groove depth and minor groove width potentials, and melting temperature.

9. The method according to claim 8, characterized in that said Bioinformatic tool 
contains algorithms recognising profiles, based on dinucleotides weight-matrices, to 
compute values for one or more of said DNA sequence features corresponding to DNA 
bending, major groove depth and minor groove width potentials, and melting 
temperature. 






10. The method according to claim 9, characterized in that said Bioinformatic tool 
computes values for all of said DNA sequence features. 



11. The method according to claim 10, characterized in that said Bioinformatic tool is 
SMAR Scan.

12. The method according to claims 8-11, characterized in that the identification of
one or more DNA sequence features further comprises a feature corresponding to one 
or more binding sites for DNA binding proteins. 


13. The method according to claim 12, characterized in that said DNA binding
protein is a transcription factor. 

14. The method according to claim 13, characterized in that the transcription factor is 
selected from the group comprising polyQpolyP domain proteins or transcription
factors.

15. The method according to claim 12, characterized in that the DNA binding protein
is selected from the group comprising SATB1, NMP4, MEF2, S8, DLX1, FREAC7,
BRN2, GATA 1/3, TATA, Bright, MSX or a combination of two or more of these
transcription factors.

16. The method according to claims 8-15, characterized in that values for the 
identification of DNA bending are comprised between 3 and 5°.

17. The method according to claim 16, characterized in that values for the
identification of DNA bending are comprised between 3.8 and 4.4°.

18. The method according to claims 8-17, characterized in that values for the
identification of the major groove depth are comprised between 8.9 and 9.3 Å and values
for the identification of minor groove width are comprised between 5.2 and 5.8 Å.

19. The method according to claim 18, characterized in that values for the
identification of major groove depth are comprised between 9.0 and 9.3 Å and values for
the identification of minor groove width are comprised between 5.4 and 5.7 Å.

20. The method according to claims 8-19, characterized in that the melting 
temperature is comprised between 55 and 75 °C.

21. The method according to claim 20, characterized in that the melting temperature
is comprised between 55 and 62 °C.

22. The use of a purified and isolated DNA sequence comprising a first isolated
matrix attachment region (MAR) nucleotide sequence which is a MAR nucleotide
sequence selected from the group comprising the sequences SEQ ID Nos 1 to 23 and
a cLysMAR element and/or fragment, a sequence complementary thereof, a part
thereof sharing at least 70% nucleotides in length, a molecular chimera thereof, a
combination thereof and variants for increasing protein production activity in a
eukaryotic host cell.


23. The use of the purified and isolated DNA sequence of claim 22, characterized in 
that said purified and isolated DNA sequence further comprises a promoter operably 
linked to a gene of interest. 

24. The use of the purified and isolated DNA sequence of claims 22-23,
characterized in that said purified and isolated DNA sequence further comprises at
least a second isolated matrix attachment region (MAR) nucleotide sequence which is a
MAR nucleotide sequence selected from the group comprising the sequences SEQ ID
Nos 1 to 23 and a cLysMAR element and/or fragment, a sequence complementary
thereof, a part thereof sharing at least 70% nucleotides in length, a molecular chimera
thereof, a combination thereof and variants for increasing protein production activity in a
eukaryotic host cell.

25. The use of the purified and isolated DNA sequence of claim 24, characterized in
that said first and at least second MAR sequences are located at both the 5' and the 3'
ends of the sequence containing the promoter and the gene of interest.

26. The use of the purified and isolated DNA sequence of claim 24, characterized in
that said first and/or at least second MAR sequences are located on a sequence
distinct from the one containing the promoter and the gene of interest.

27. The use of the purified and isolated DNA sequence of any of claims 22-26,
characterized in that said purified DNA sequence is in the form of a linear DNA 
sequence as vector. 






28. A method for transfecting a eukaryotic host cell, said method comprising

a) introducing into said eukaryotic host cell at least one purified DNA sequence
comprising at least one DNA sequence of interest and/or at least one purified and
isolated DNA sequence consisting of a MAR nucleotide sequence or other chromatin
modifying elements,

b) subjecting within a defined time said transfected eukaryotic host cell to at least one
additional transfection step with at least one purified DNA sequence comprising at least
one DNA sequence of interest and/or with at least one purified and isolated DNA
sequence consisting of a MAR nucleotide sequence or other chromatin modifying
elements,

c) selecting said transfected eukaryotic host cell.

29. The method of claim 28, characterized in that said DNA sequence of interest is 
a gene of interest coding for a protein operably linked to a promoter.

30. The method of claim 29, characterized in that the selected transfected eukaryotic
host cells are high protein producer cells with a production rate of at least 10 pg per cell 
per day. 

31. The method of claims 28-30, characterized in that the MAR nucleotide sequence is
selected from the group comprising the sequences SEQ ID Nos 1 to 23 and a cLysMAR
element and/or fragment, a sequence complementary thereof, a part thereof sharing at
least 70% nucleotides in length, a molecular chimera thereof, a combination thereof
and variants.

32. The method of claims 28-31, characterized in that the defined time corresponds
to intervals related to the cell division cycle. 

33. The method of claim 32, characterized in that the defined time is the moment the
host cell has just entered into a second cell division cycle.

34. A purified and isolated DNA sequence identified according to claims 8 to 21.

35. The purified and isolated DNA sequence of claim 34, having matrix attachment
region (MAR) activity comprising a sequence selected from the sequences SEQ ID Nos 
1 to 23, a sequence complementary thereof, a part thereof sharing at least 70% 
nucleotides in length, a molecular chimera thereof, a combination thereof and variants. 

36. The purified and isolated DNA sequence of claim 35, having matrix attachment
region (MAR) activity comprising a sequence selected from the sequences SEQ ID Nos 
21 to 23, a sequence complementary thereof, a part thereof sharing at least 70% 
nucleotides in length, a molecular chimera thereof, a combination thereof and variants. 

37. A purified and isolated cLysMAR element and/or fragment having protein
production increasing activity, a sequence complementary thereof, a part thereof
sharing at least 70% nucleotides in length, a molecular chimera thereof, a combination 
thereof and variants. 

38. The cLysMAR element and/or fragment of claim 37 consisting of at least one
nucleotide sequence selected from the B, K and F regions. 






39. A synthetic MAR sequence consisting of natural MAR element and/or fragments 
assembled between linker sequences. 

40. The synthetic MAR sequence of claim 39, characterized in that the MAR is
cLysMAR. 

41. The complete matrix attachment region of claims 39 to 40, characterized in that
the linker sequences are BglII-BamHI linkers.

42. A process for the production of a protein wherein 

a) a eukaryotic host cell transfected according to claims 28 to 33 is cultured in a
culture medium under conditions suitable for expression of said protein and
b) said protein is recovered.

43. A eukaryotic host cell transfected according to any one of claims 28 to 33. 

44. A cell transfection mixture or kit comprising at least one purified and isolated 
DNA sequence according to claims 1 to 7 and/or 34 to 38.

45. A transgenic organism characterized in that its genome has stably integrated at 
least one DNA sequence according to claims 1 to 7 and 34 to 38, and/or that at least
some of its cells have been transfected according to the method of claims 28-33. 






46. The use of the purified and isolated DNA sequences of claim 34 as MAR 
sequences having protein production increasing activity. 
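
The numerical ranges recited in claims 16 to 21 can be illustrated with a minimal sketch. It only checks whether already-computed feature values for a sequence window fall inside the claimed intervals; it does not reproduce SMAR Scan or its dinucleotide weight-matrices. The class `WindowFeatures`, the dictionaries `BROAD_RANGES` and `NARROW_RANGES`, and the function `within` are hypothetical names introduced here for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WindowFeatures:
    """Feature values for one sequence window (hypothetical container)."""
    bend_deg: float            # DNA bending, in degrees
    major_groove_depth: float  # in angstroms
    minor_groove_width: float  # in angstroms
    melting_temp_c: float      # in degrees Celsius


# Broad intervals as recited in claims 16, 18 and 20.
BROAD_RANGES = {
    "bend_deg": (3.0, 5.0),
    "major_groove_depth": (8.9, 9.3),
    "minor_groove_width": (5.2, 5.8),
    "melting_temp_c": (55.0, 75.0),
}

# Narrower intervals as recited in claims 17, 19 and 21.
NARROW_RANGES = {
    "bend_deg": (3.8, 4.4),
    "major_groove_depth": (9.0, 9.3),
    "minor_groove_width": (5.4, 5.7),
    "melting_temp_c": (55.0, 62.0),
}


def within(features: WindowFeatures, ranges: dict) -> bool:
    """Return True if every feature lies inside its (low, high) interval."""
    return all(
        low <= getattr(features, name) <= high
        for name, (low, high) in ranges.items()
    )
```

A window with a bend of 4.0°, a major groove depth of 9.1 Å, a minor groove width of 5.5 Å and a melting temperature of 58 °C would satisfy both sets of intervals under this sketch.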

The present invention relates to purified and isolated DNA sequences having protein
production increasing activity and more specifically to the use of matrix attachment
regions (MARs) for increasing protein production activity in a eukaryotic cell. Also
disclosed is a method for the identification of said active regions, in particular MAR
nucleotide sequences, and the use of these characterized active MAR sequences in a
new multiple transfection method.

SEQUENCE LISTING



<110> Selexis S.A. 



<120> High Efficiency Gene Transfer and expression in Mammalian
Cells by a Multiple Transfection Procedure of MAR Sequences

<130> SEL EP 003 

<160> 23 

<170> PatentIn version 3.1

<210> 1 

<211> 320 

<212> DNA 

<213> Homo sapiens 



<220> 

<221> misc_binding 

<223> MAR of human chromosome 1, nt from 36686 to 37008



<400> 1 

ttatattatg ttgttatata tattatatta tgttattaga ttatattatg ttgttata
tt 60

atataataat attatattat atattatata ttatattata taatatataa taatatta
ta 120

taattatata ttacattata taatatataa taatattata taattatata ttacatta
ta 180

taatatataa taatattata taataatata taattatata atatataata atattata
ta 240

atattatata atattatata atatataaat atataataat atatattata ttatataa
ta 300

gtatataata ttatataata
320



<210> 2 

<211> 709

<212> DNA 

<213> Homo sapiens 



<220> 

<221> misc_binding

<222> (1)..(709)

<223> MAR of human chromosome 1, nt from 142276 to 142984



<400> 2 

tacaatatat tttctattat atatattttg tattatatat aatatacaat atattttc 
ta 60 

ttatatataa tatattttgt attatatata ttacaatata ttttgtatta tataatat 
at 120 

aatacaatat ataatatatt gtattatata ttatataata caatatatta tatattgt 
at 180 

tatatattat atataatact atataatata ttgtattata tattatatat aatactat 
at 240 

aatatatttt attatatatt atatataata caatatataa tatattgtat tataatac 
aa 300 

tgtattataa tgtattatat tgtattatat attatatata atacaatata taataata 
ta 360 

ttataatata taataataat ataatataat aataatatat attgtattat atattata 
ta 420 

atacaatata taatatattg tattatatat attttattac atataatata taatacat 
ta 480 

tataatatat tttgtattat atataatata ttttattatg tattatagat aatatatt 
tt 540 

attatatatt atatataata caatatataa tatattttgt attgtatata atatataa 
ta 600 

caatatataa tatattgtat tatatataat attaatatat tttgtattat atatttat 
at 660 

tttatattat aattatgttt tgcattatat atttcatatt atatatacc 
709 



<210> 3 

<211> 409 

<212> DNA

<213> Homo sapiens 

<220> 

<221> misc_binding

<222> (1)..(409)

<223> MAR of human chromosome 1, nt from 1368659 to 1369067






<400> 3 

tacacataaa tacatatgca tatatattat gtatatatac ataaatacat atgcatat 
ac 60 

attatgtata tatacataaa tacatatgca tatacattat gtatatatac ataaatac 
at 120 

atgcatatac attatgtata tatacataaa tacatatgca tatacattat gtatatat 
ac 180 

ataaatacat atgcatatac attatgtata tatacataaa tacatatgca tatacatt 
at 240 

gtatatatac ataaatacat atgcatatat tatatacata aattatatta tatacata 
at 300 

acatatacat atattatgtg tatatataca taaatacata tacatatatt atgtgtat 
at 360 

atacatgata catatacata tattatgtat atatatacat aaatacata 
409 



<210> 4 

<211> 394 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_binding

<222> (1)..(394)

<223> MAR of human chromosome 1, nt from 2839089 to 2839482 



<400> 4 

tatgtatata tacacacata tgtatatata cacacatatg tatatacgta tatatgta 
ta 60

tatacacaca tatgtatata cgtatatatg tatatataca cacatatgta tatacgta
ta 120 

tatgtatata tacacacata tgtatatacg tatatatgta tatatacaca catatgta 
ta 180 

tatgtatata tacacacata tgtatatacg tatatatgta tatatacaca catgtgta 
ta 240 

tatatataca catatgtata tatgtatata tacacacata tgtatatatg tgtatgta 
ta 300 

tatacacaca tatgtatata tacacatata tatgtatata tacacacata cttatata 
ta 360

cacatatata tgtatatata cacatatgta taca
394 



<210> 5 

<211> 832 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_binding

<222> (1)..(832)

<223> MAR of human chromosome 1, nt from 1452269 to 1453100



<400> 5 

tatattacta tatatacaat atacatatta ctatatatac catgtattac tatatata 
tc 60

tactatatat attactatat atacaaaata tatattacta tatatacaat atacatat 
ta 120 

ctatatatac catatattac tatatatatc tactatatat attactatat atacaaaa 
ta 180 

tatattacta tatatactat atattactgt atatacaata tatattacta tatatata
ct 240

atatattact atatatacac tatatattac tatatataca caatatatat attactat 
at 300 

atacacaatg tatataacta tatatacaat atatattact atatatacta tatatatt 
ac 360 

tatacatact atatattact ctatatatac aatatatata ttacaatata tactacat 
at 420 

tactacatat actttatata ttactatata tactatatat tactgtatat acaatata 
ta 480 

ttactaaata tacacaatat atattactat atatacacaa tatatatatt actatata
ta 540 

cacattatat atgactatat atacacacta tatatattac tatatataca caatatat 
aa 600 

ctatatatac acagtataca tattactata tatacacaat atatatatta ctatatat 
ac 660 

actatatatt actatatata cacaatatat attactctat gtatacacta tatatatt 
ac 720 

tatatataca gaatatatat aactatatat acactatatt actatatata ctatatat
ta 780 

ctatatgtac tatatatatt actatatata ctatatatta ctatatatac ac 
832 



<210> 6 

<211> 350 

<212> DNA

<213> Homo sapiens 

<220> 

<221> misc_binding 

<222> (1)..(350)

<223> MAR of human chromosome 1, nt from 831495 to 831844 



<400> 6 

aatatataat atataaatat taatatgtat tatataatat atattaatat attatatt
at 60

attactatat aaataatatt aatatattat attaaaatat taataaatat atcatatt
aa 120

atattatatt aattaaatat taataaatat attatattaa tatatttata tattaaac
ct 180

ataacatatg catatactta tttatatata acatgca^gt acttatttat atatacaa
ta 240

tatatttata tattatataa tatattatat gtatttatat attatatatc atatatta
ta 300

tgtatttata tattatatat catataatat atatatttat attatatata
350



<210> 7 

<211> 386 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_binding

<222> (1)..(386) 

<223> MAR of human chromosome 1, nt from 1447225 to 1447 



<400> 7 

acatttaatt taattatata ctgctatata taattaaatc tatatatcta tataactt 



aatttatttt aatttaatta tatatactat atagttatat atacatatat gtaattat
at 120

atagtataat tatagtatat atgtatatat aatgtaagta aatatatagt atatattt
at 180

atatactata tatttataca tatgtcttta tatatactaa tatatataca catatgta
at 240

atgtacatat ggcatatatt ttatagtgta tatatacata tatgtaatat atatagta
at 300

atgtaaatat atagtacata tttaattata tggtaatata tacacatata tgtaatat
gt 360

gtattatagt acatatttta tagtat
386



<210> 8 

<211> 585 

<212> DNA

<213> Homo sapiens 

<220> 

<221> misc_binding

<222> (1)..(585)

<223> MAR of human chromosome 1, nt from 4955365 to 4955949 



<400> 8 

atacacacat atacacatat gtacgtatat atactatata tacacacata tacacata
tg 60

tacgtatata tactatatat acacacatat acacatatgt acgtatatat actatata
ta 120

cacacatata cacatatgta cgtatatata ctatatatac acacatatac acatatgt
ac 180

gtatatatac tatatataca cacatataca catatgtacg tatatattat atatacac
ac 240

atatacacat atgtacgtat atatactata tatacacaca tatacacata tgtacgta
ta 300

tatactatat atacacacat atacacatat gtacgtatat atactatata tacacaca
ta 360

tacacatatg tacgtatata tactatatat acacacatat acacatatgt acgtatat
at 420

actatatata cacacatata cacatatgta cgtatatata ctatatatac acacatat
ac 480



acatatgtac gtatatatac tatatataca cacatataca catatgtacg tatatata 
ct 540 

atatataccc atacacatac gtatatacgt acatatatat acgta 
585 



<210> 9 

<211> 772 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_binding 

<222> (1)..(772)

<223> MAR of human chromosome 1, nt from 5971862 to 5972633 



<400> 9 

agtaaacata tatatagtaa atatatatag tgtatatata gtaaatatat atagtgca
ta 60

tatatagtgc atatatatag tgtatatata gtaaatatat agtgtatata tatagtaa
at 120 

atatatagtg tatatatagt aaatatatat agtaaatata tatatactat atatagta 
aa 180 

tatatatata ctatatatag taaatatata tatagtatat atatagtaaa tatatata 
ta 240 

gtatatatat agtaaatata tatatagtat atatatagta aatatatata tagtatat 
at 300 

agtaaatata tatagtatat atatagtaaa tatatatata gtatatatat agtaaata 
ta 360 

tatatagtat atatatagta aatatatata tagtatatat atagtaaata tatatagt 
at 420 

atatatagta aatatatata gtatatatat agtaaatata tatagtatat atatagta 
aa 480 

tatatataca ctgtatatat atagtaaata tatatacact gtatatatat agtaaata 
ta 540 

tatacactgt atatatatag taaatatata tacactgtat atatatagta aatatata
ta 600

cactgtatat acatagtaaa tatatataca ctgtatatac atagtaaata tatataca
ct 660 

gtatatacat agtaaatata tatacactgt atatacatag taaatatata tacagtgt 
at 720 

atacatagta aatatatata cagtgtatat acatagtaaa tatatataca gt 
772 



<210> 10

<211> 304

<212> DNA

<213> Homo sapiens

<220>

<221> misc_binding

<222> (1)..(304)

<223> MAR

<400> 10

atatataata
ta 60

ttatatataa
ta 120

tatatataat
ta 180

atatatataa
ta 240

atatatataa
at 300

atta
304



<210> 11

<211> 311

<212> DNA



<213> Homo sapiens 
<220> 

<221> misc_binding 

<222> (1)..(311) 

<223> MAR of human chromosome 1, nt from 9418531 to 9418841

<400> 11

tatatataat atttatatat aatattcatg tatttatata taaatattta tatattta
ta 60

tataaatatt tatatattta tatataaata tttatatatt tatatataat atttatac
at 120

tatatataat atttatatat tatatataat atttatatat aatatttata tattatat
at 180

aatatttata tatttatatg tataatatat attttatata tgtatgtata atatatat
tt 240

tatatatgta tgtataatat attttatata tgtatgtata atatattatt atatataa
ta 300

tataatttat a
311



<210> 12 

<211> 302 

<212> DNA 

<213> Homo sapiens 



<220>

<221> misc_binding

<222> (1)..(302)

<223> MAR of human chromosome 1, nt from 15088789 to 15089090 



<400> 12 

atataatata tatattatat atataaatat atataaatat ataacatata tattatat
at 60

aaatatatat aaatatataa catatatatt atatatataa atatatataa atatataa
ca 120

tatatattat atatataaat atatataaat atataacata tatattatat atataaat
at 180

atattatata tttatatata taatatatat aaatatataa tatatattta tatatata
at 240

atatataaat atataatata tatatttata tataatatat ataaatatat aatatata
at 300

at
302



<210> 13

<211> 461 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_binding
<222> (1)..(461)

<223> MAR of human chromosome 1, nt from 6791827 to 6792287
<400> 13 

tatataatat atattatata tacacatata taatatatat tatatataca catatata 
atatattata tatacacata tataatatat attatatata cacatatata atatatat 
cS tata aSS S tatataatat atattatata tacacatata taatatatat tatatata 
catatataat atatattata tatacacata tataatatat attatatata oacatata 
?a 3tata oS a tai;atacaca tatgtaatat atattataca cacauatata atatatat 
JS^HZSSq I tataatata * attatatata catatataat atatattata tatacaca 
tataatatat attatatata cacatatata atatatatta tatatacaca tataatat 

aatatataca catatataat atatatatta tatatgcaca t 
461

<210> 14 

<211> 572

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_binding 
<222> (1)..(572)

<223> MAR of human chromosome 1, nt from 163530 to 164101 
<400> 14 

atattataat tatatatatt atatataatt atataaaata tatattataa ttatatat 

tttatataat atatatatta taattaatat attatatata atatatatat tatatata
at 120

atatatatta tatatattat atataatata tataatatat ataatatata atataata
ta 180

tatattatat ataatatata atatatataa tatattataa tataatatat ataatata
ta 240

atataatata tataatatat aatataatat ataatatata atatatataa tatataat
at 300

aatatataat atatataata tataatataa tatataatat atataatata ttataata
ta 360

atatatataa tatataatat aatatatata atatataata taatatataa tatataat
at 420

atatttaata tatttattaa ttatttgtta tatatttatt aatatataat atataata
ta 480

tttaatatat tataactata tattatatta taattatata tattatatat atacaatt
at 540

aattatatat tatatatact tataatatat at
572



<210> 15

<211> 357

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_binding

<222> (1)..(357)

<223> MAR of human chromosome 1, nt from 1842332 to 1842688



<400> 15 

tatatctata tatatctata tatatataat atagataata tctatatata taatatag
at 60

aatattatct atatataata tagataatat tatctatata taatatagat aatattat
ct 120

atatataaaa ttatattata tctatatata ttatatatat aaaattatat tatatcta
ta 180

tataatatag ataatatcta tatataaata gataatatct atatatataa tatagata
tt 240

atctatatta tagatataga taatattatc tatattatag atattatcta tatataat

agataatatt atctatatta tatatataat atatctatat tatctataat attatct
357

<210> 16 

<211> 399 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_binding 
<222> (1)..(399) 

<223> MAR of human chromosome 1, nt from 2309560 to 2309958 
<400> 16 

attatatata atatatatta tatattatat atatcaagca gcagatataa tatataat 

3.XZ. 6 0 

atataatata tataatatat attgtatatt atataatata taatatatat aatatata 

JL V 

gtatattata taatatataa tatatataat atatattgta tattatataa tatataat 

9t X 8 0 

^atatraatatwat^^ bgfcaatat 1 1 1 "» 

at 240

tatataatat atattatata ttatatataa tatatattat atataatata tattacat 
aa 300 

tatattacat atattacgta atatatgtta tatattacat ataatatata acatatat 
ta 360 

cgtaatatat gtaatatatt acatataata tatacatta
399 

<210> 17 

<211> 394 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_binding 

<222> (1)..(394) 

<223> MAR of human chromosome 1, nt from 2231759 to 2232152 
<400> 17

atatatactt ataaattata tacttatata tacttataaa ttatatactt atatatac 
tt 60 

ataaattata tacttatata tacttataaa ttatatactt atatatactt ataaatta
ta 120 

tacttatata tacttataaa ttatatactt atatatactt ataaattata tacttata 
ta 180 

tacttataaa ttatatactt atatataatt ataaattata tacttatata taattata 
aa 240 

ttatatactt atatataatt ataaattata tacttatata taattataaa ttatatac 
tt 300 

atatataatt ataaattata tacatatata taattataaa ttatatacat atataatt
at 360 

aaattatata catatataat tataaattat atac 
394 



<210> 18 

<211> 387

<212> DNA

<213> Homo sapiens 



<220> 

<221> misc_binding 

<222> (1)..(387)

<223> MAR of human chromosome 1, nt from 7406524 to 7406910 



<400> 18 

tatattatat ataatatata ttatatataa tataaataat atatattata tataatat 
at 60 

aaataatata taatatataa ataatatata atatataata tataaataat atataata 
ta 120 

taacatataa ataatatata taatatataa ataatatata taatatataa ataatata 
ta 180 

taatatataa aaatatataa tatataatac atatataaat aatatattat attatata 
tg 240 

atacataata tattatatat aatatattat atgatacata atatattata tagaatat 
at 300 

tatatgatac ataatatatt atatagaata tattatatga tacataatat attatatg 
at 360



acataatata ttatatataa tatatta 
387 



<210> 19 

<211> 370 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_binding

<222> (1)..(370)

<223> MAR of human chromosome 1, nt from 9399572 to 9399941



<400> 19 
catatataca tatatacaca tatatacaca tatatataca catacatatg tacacata
ta 60

tatacacata tgtatacaca tatatacaca tatatacaca catatataca catatata
ca 120

cacatatata cacatatata cacatatata cacatataca catatataca catatata
ca 180

tatatacaca tatatataat atacacacat atatatacac atatatacac acatatat
ac 240

acatatatac acatatatat acacatatat acacatatat acatatatac acatatat
at 300

acatatatac acatatatac atatatacac atatatacat atatacacac atatatac
ac 360

atacatatac
370



<210> 20 

<211> 377

<212> DNA 

<213> Homo sapiens 



<220> 

<221> misc_binding 
<222> (1)..(377)



<223> MAR of human chromosome 1, nt from 12417411 to 12417787 



<400> 20 

attatatata atacatataa ttatatattt atatataaat tataataaat acatataa
tt 60

atatatttat atataaatta tatataataa atacatataa ttacatatat ttataaat
ta 120

taataaatac atataattac atatatttat atatgaatta tatataataa atacatat
aa 180

ttatatatat ttatatgtag attatatata aatatatata atttatatat ataataat
at 240

atataattta tatatataat tatatatata ataaatatat ataatttata tatataat
ta 300

tatatataat aaatatataa taatatatat aatttatata tataattata tatataat
aa 360

atatatataa tttatat
377



<210> 21 

<211> 1524 

<212> DNA

<213> Homo sapiens 



<220> 

<221> misc_binding

<222> (1)..(1524)

<223> MAR of human chromosome 1, nt from 1643307 to 1644830 



<400> 21 

tataaatata tataaatata taaatatata taaatatata aatatatata aatatata
ta 60

aatatataaa aatatataaa tatatataaa tatatataaa tatataaaaa cataaaaa
ta 120

tatataaata tatataaata tataaaaata tataaatata taaatatata aaaatata
ca 180

aatatataaa tatatacata aatatatata aatatatata aatatataaa aatatata
ta 240

aatatataaa tatatataaa tatatataaa tatatataaa tatataaaaa tatatata
aa 300

tatataaata tataaaaata tatataaata tataaatata taaatatata taaatata
ta 360
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aatatataaa taaatataag tatttatgaa tatatatgaa tatataaata tataaaaa 

" & U 

atatataaat atataaatat atataaatat ataaatatat acatatatac atatataa 
aaataaatat aagtatttat gaatatatat gaatatataa atatataaaa aatatata 

"Co 0 A U 

aatatataaa tatatataaa tataaatata taaaaatata taaaaatata tataaata 
taaatatata taaatatata aatatatata aatatatata aatatataaa tatatata 

3d 660 

tatatataaa tatataaata tataaatata tataaatata tataaatata taaatata 
ta 720 

aatataaata tataaatata tataaatata tataaatata taaatatata taaatata 

C3 V 8 0 

taaatatata taaatatata taaatatata aatatatata aatatatata taaatata 
taaatatata aatatataaa tatataaaaa tatataacaa tatataaata tatataaa 

3.3. 900 

tatataacaa tatataaata taaatatata taaaaatata taacaatata taaatata 
aa 960 " 



aa t3t 1020 a tatataaata taa atataaa aaatatatat aaatatataa atatatat 

atatataaat gtataaatat atataaaaat atataacaat atataaatat ataaatat 
at X080 

aacaatatat aaatatataa aaatatataa caatatataa atataaatat atataaaa 
atataacaat atataaatat aaatatatat ataaatatat aaatataaat ataaaaaa 
tr at 12S0 a tataaatata tatataaata tatataaata tataaatgta taaatata 
taaatatata aatatataaa aatatataaa tatatataaa tatatataaa tatataaa 
tI aat 13 a S a aatatatata aatatataaa tataaatata taaacatata taaatata 
taaataaaca tatataaaga tatataaaga tataaagata tataaatata taaatata 
ta 1440

aagatatata aatatataaa gatatataaa tatataaaga tatataaata tataaaga 
ta 1500 

tataaatata atatataaat atat 
1524 

<210> 22 

<211> 664 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_binding 
<222> (1)..(664)

<223> MAR of human chromosome 1, nt from 1398763 to 1399426
<400> 22 

acacatatat atataaaata tatatatata cacacatata tataaaatat atatatat
ac 60 

acacatatat ataaaatata tatatacaca catatatata aaatatatat atacacac 
at 120 

atatataaaa tatatatata cacacatata tataaaatat atatatacac acatatat 

at 180



aaaatatata tatacacaca tatatataaa atatatatat acacacatat atataaaa 
ta 240 

tatatataca cacatatata taaaatatat atatacacac atatatataa aatatata 
ta 300 

tacacacata tatataaaat atatatatac acacatatat aaaatatata tatacaca 
ca 360 

tatataaaat atatatatac acatatatat aaaatatata tatacacata tatataaa 
at 420 

atatatacac acatatatat aaaatatata tatacacaca tatatataaa atatatat 
at 480 

acacatatat ataaaatata tatatacaca tatatataaa atatatatat atacacat 
at 540 

atataaaata tatatacaca catatatata aagtatatat atacacacat atatataa 
aa 600 

tatatatata cacatatata taaaatatat atatacacat atatataaaa tatatata 
ta 660
caca 

664 



<210> 23 

<211> 1428 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_binding
<222> (1)..(1428)

<223> MAR of human chromosome 2, nt from 17840365 to 17841792
<400> 23 

aatttattat atattatata ttatatatat tatatatatt atatattata tatattat
at 60

atattatata ttatatatat tatatattat atatttatat ataatatata tctaatat
at 120 

atattagata taatatatat ctaatatata tatattttat atatataata tatctcta 
at 180 

atatatattt tatatgtata taatatatct ctaatatata tatattttat atgtatat 

aa 240



tatatctcta atatatatat tttttatata taatatatct ctaatatata tattttat 
at 300 

atataatata tatctaatat atataatata tatattagat atatataaaa tatatatg 
at 360 

atatttatta tatatataat atataatata taatatatat attatattat atacatat 
at 420 

attatataca atatatatta tatatatttt atatacatta tatattatat atatttta 
ta 480 

tacaatatat attatatatt ttatatacaa tatatattat atatatttta tattttta 
ta 540 

tacaatatat attatatata ttttatatat aatatatatt atatatattt tatataat 
at 600 

atattatata tattttatat ataatatata ttatataaat tatatataat atatatta 
ta 660 

ataaattata atatttttta tatatataat atgtatttta tatataatat attataat 
at 720

atattttata tataatatat tataatatat attttatata taatatatta taatatat 
at 780 

tttatattat aatatattat aatatatatt ttatatataa tatattataa tatatatt
tt 840 

atatataata tattataata tatattttat atataatata ttataatata tatattat 
aa 900 

tatatatttt atatataata tattatcata tatatattaa atatatattt tatatata 
at 960 

atattataat atatatatta taatatatat tttatatata atatattata atatatat 
at 1020 

tataatatat attttatata taatatatta taatatatat tttatatata atatatta 
ta 1080 

atatatattt tatatataat atattataat atatatttta tatataetat aatatata 
tt 1140 

ttatatataa tatattataa tatatatttt atatataata tattataata tatatttt 
at 1200 

atataatata ttataatata tattttatat ataatatatt ataatatata ttttatat 
at 1260 

„ ^aatatat-tat~aatatat att 'ttatat*at^a^€afr^ tatatatttt atatataa 

ta 1320 

tattataata tatattttat atataatata ttaattaaat ttattaattt attaatta 
tt 1380 

aatatttatt atattattaa ttaataatat ataaattatt aatatata 
1428 
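
Each entry in the listing above gives a declared length under `<211>` and a sequence printed in groups of ten nucleotides with running position counts. A minimal sketch of a consistency check between the two is shown below. The function names `clean_sequence` and `length_matches` are hypothetical, and the short example string is invented for illustration; it is not one of the listed sequences.

```python
import re


def clean_sequence(listing_text: str) -> str:
    """Keep only nucleotide letters, dropping spaces, line breaks and
    the running position counts printed at the end of each line."""
    return "".join(re.findall(r"[acgtn]", listing_text.lower()))


def length_matches(listing_text: str, declared_length: int) -> bool:
    """Compare the cleaned sequence length with the <211> value."""
    return len(clean_sequence(listing_text)) == declared_length


# Invented 22-nucleotide example laid out like a listing entry.
example = "atatataata ttatatataa\nta 22"
print(clean_sequence(example))
print(length_matches(example, 22))
```

A mismatch under this check points to nucleotides lost or duplicated in transcription rather than to an error in the declared length, so it is best read as a prompt to re-inspect the entry.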



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